Re: [PATCH v2] string_helpers: fix precision loss for some inputs

2015-11-04 Thread Rasmus Villemoes
On Wed, Nov 04 2015, James Bottomley  
wrote:

> On Wed, 2015-11-04 at 00:26 +0100, Rasmus Villemoes wrote:
>> On Tue, Nov 03 2015, James Bottomley  
>> wrote:
>> >> Please spell it U32_MAX
>> >
>> > Why?  there's no reason not to use the arithmetic UINT_MAX here.  Either
>> > works, of course but UINT_MAX is standard.
>> 
>> We're dealing with explicitly sized integers
>
> An integer is explicitly sized: it's 32 bits. That's why UINT_MAX is
> a universal constant.

In the Linux universe, yes. It's kind of amusing how you try to argue
based on the UINT_MAX name being (defined in a) standard while at the same
time very much rely on sizeof(int) having a value which is not specified
by the standard. I repeat:

>> U32_MAX is the natural name for the appropriate constant.

(and it's defined right next to UINT_MAX in kernel.h, so it's not like
you'd have to introduce that macro).

>> Also, you could do > U32_MAX instead of >= U32_MAX, but that's unlikely
>> to make any difference (well, except it might generate slightly better
>> code, since it would allow gcc to just test the upper half for being 0,
>> which might be cheaper on some architectures than comparing to a
>> literal).
>
> Heh if we're going to be that concerned about the code generation, then
> we should just tell gcc exactly how to do it instead of hoping it can
> work it out for itself, so
>
> while (blk_size >> 32) {
> ...

Nah, that would still require the compiler to be able to transform that
to the other form, which apparently it isn't. On x86_64, the simplest
is to load U32_MAX once into a register and then do r/r comparisons, but
when it's possible to directly test the upper half (e.g. when the 64 bit
value is represented in a pair of 32 bit registers) that's much
simpler. gcc generates good code for 'blk_size > U32_MAX' on both x86_64
and x86_32, but ends up doing an extra cmp on x86_32 for >=, and ends up
doing mov,shift,test inside the loop on x86_64 for 'blk_size >> 32'.

Rasmus
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] string_helpers: fix precision loss for some inputs

2015-11-03 Thread James Bottomley
On Wed, 2015-11-04 at 00:26 +0100, Rasmus Villemoes wrote:
> On Tue, Nov 03 2015, James Bottomley  
> wrote:
> 
> > On Tue, 2015-11-03 at 23:13 +0100, Rasmus Villemoes wrote:
> >> On Tue, Nov 03 2015, James Bottomley 
> >>  wrote:
> >> 
> >> > From: James Bottomley 
> >> >
> >> > It was noticed that we lose precision in the final calculation for some
> >> > inputs.  The most egregious example is size=3000 blk_size=1900 in units 
> >> > of 10
> >> > should yield 5.70 MB but in fact yields 3.00 MB (oops). This is because 
> >> > the
> >> > current algorithm doesn't correctly account for all the remainders in the
> >> > logarithms.  Fix this by doing a correct calculation in the remainders 
> >> > based
> >> > on napier's algorithm.  Additionally, now we have the correct result, we 
> >> > have
> >> > to account for arithmetic rounding because we're printing 3 digits of
> >> > precision.  This means that if the fourth digit is five or greater, we 
> >> > have to
> >> > round up, so add a section to ensure correct rounding.  Finally account 
> >> > for
> >> > all possible inputs correctly, including zero for block size.
> >> >
> >> > Reported-by: Vitaly Kuznetsov 
> >> > Cc: sta...@vger.kernel.org   # delay backport by two months for 
> >> > testing
> >> > Fixes: b9f28d863594c429e1df35a0474d2663ca28b307
> >> > Signed-off-by: James Bottomley 
> >> >
> >> > --
> >> >
> >> > v2: updated with a recommendation from Rasmus Villemoes to truncate the
> >> > initial precision at just under 32 bits
> >> >
> >> > diff --git a/lib/string_helpers.c b/lib/string_helpers.c
> >> > index 5939f63..363faca 100644
> >> > --- a/lib/string_helpers.c
> >> > +++ b/lib/string_helpers.c
> >> > @@ -43,38 +43,40 @@ void string_get_size(u64 size, u64 blk_size, const 
> >> > enum string_size_units units,
> >> >  [STRING_UNITS_10] = 1000,
> >> >  [STRING_UNITS_2] = 1024,
> >> >  };
> >> > -int i, j;
> >> > -u32 remainder = 0, sf_cap, exp;
> >> > +static const unsigned int rounding[] = { 500, 50, 5, 0};
> >> 
> >> j necessarily ends up being 0, 1 or 2. Any reason to include the last 
> >> entry?
> >
> > No reason beyond a vague worry someone might try to increase the printed
> > precision by one digit.
> 
> But that would seem to require prepending 5000 to that array and
> changing various constants below to 1 (aside from checking all
> callers to see if they pass a sufficient buffer size) - the 0 doesn't
> serve any purpose in that scenario either.
> 
> >> > +
> >> > +while (blk_size >= UINT_MAX)
> >> >  i++;
> >> > -}
> >> >  
> >> > -exp = divisor[units] / (u32)blk_size;
> >> > -/*
> >> > - * size must be strictly greater than exp here to ensure that 
> >> > remainder
> >> > - * is greater than divisor[units] coming out of the if below.
> >> > - */
> >> > -if (size > exp) {
> >> > -remainder = do_div(size, divisor[units]);
> >> > -remainder *= blk_size;
> >> > +while (size >= UINT_MAX)
> >> >  i++;
> >> 
> >> Please spell it U32_MAX
> >
> > Why?  there's no reason not to use the arithmetic UINT_MAX here.  Either
> > works, of course but UINT_MAX is standard.
> 
> We're dealing with explicitly sized integers

An integer is explicitly sized: it's 32 bits.  That's why UINT_MAX is a
universal constant.

>  and the comment even says
> that we're reducing till we fit in 32 bits, so that we can do a
> 32x32->64 multiplication. U32_MAX is the natural name for the
> appropriate constant.
> 
> Also, you could do > U32_MAX instead of >= U32_MAX, but that's unlikely
> to make any difference (well, except it might generate slightly better
> code, since it would allow gcc to just test the upper half for being 0,
> which might be cheaper on some architectures than comparing to a
> literal).

Heh if we're going to be that concerned about the code generation, then
we should just tell gcc exactly how to do it instead of hoping it can
work it out for itself, so

while (blk_size >> 32) {
...

James

> Rasmus
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] string_helpers: fix precision loss for some inputs

2015-11-03 Thread Rasmus Villemoes
On Tue, Nov 03 2015, James Bottomley  
wrote:

> On Tue, 2015-11-03 at 23:13 +0100, Rasmus Villemoes wrote:
>> On Tue, Nov 03 2015, James Bottomley  
>> wrote:
>> 
>> > From: James Bottomley 
>> >
>> > It was noticed that we lose precision in the final calculation for some
>> > inputs.  The most egregious example is size=3000 blk_size=1900 in units of 
>> > 10
>> > should yield 5.70 MB but in fact yields 3.00 MB (oops). This is because the
>> > current algorithm doesn't correctly account for all the remainders in the
>> > logarithms.  Fix this by doing a correct calculation in the remainders 
>> > based
>> > on napier's algorithm.  Additionally, now we have the correct result, we 
>> > have
>> > to account for arithmetic rounding because we're printing 3 digits of
>> > precision.  This means that if the fourth digit is five or greater, we 
>> > have to
>> > round up, so add a section to ensure correct rounding.  Finally account for
>> > all possible inputs correctly, including zero for block size.
>> >
>> > Reported-by: Vitaly Kuznetsov 
>> > Cc: sta...@vger.kernel.org # delay backport by two months for testing
>> > Fixes: b9f28d863594c429e1df35a0474d2663ca28b307
>> > Signed-off-by: James Bottomley 
>> >
>> > --
>> >
>> > v2: updated with a recommendation from Rasmus Villemoes to truncate the
>> > initial precision at just under 32 bits
>> >
>> > diff --git a/lib/string_helpers.c b/lib/string_helpers.c
>> > index 5939f63..363faca 100644
>> > --- a/lib/string_helpers.c
>> > +++ b/lib/string_helpers.c
>> > @@ -43,38 +43,40 @@ void string_get_size(u64 size, u64 blk_size, const 
>> > enum string_size_units units,
>> >[STRING_UNITS_10] = 1000,
>> >[STRING_UNITS_2] = 1024,
>> >};
>> > -  int i, j;
>> > -  u32 remainder = 0, sf_cap, exp;
>> > +  static const unsigned int rounding[] = { 500, 50, 5, 0};
>> 
>> j necessarily ends up being 0, 1 or 2. Any reason to include the last entry?
>
> No reason beyond a vague worry someone might try to increase the printed
> precision by one digit.

But that would seem to require prepending 5000 to that array and
changing various constants below to 1 (aside from checking all
callers to see if they pass a sufficient buffer size) - the 0 doesn't
serve any purpose in that scenario either.

>> > +
>> > +  while (blk_size >= UINT_MAX)
>> >i++;
>> > -  }
>> >  
>> > -  exp = divisor[units] / (u32)blk_size;
>> > -  /*
>> > -   * size must be strictly greater than exp here to ensure that remainder
>> > -   * is greater than divisor[units] coming out of the if below.
>> > -   */
>> > -  if (size > exp) {
>> > -  remainder = do_div(size, divisor[units]);
>> > -  remainder *= blk_size;
>> > +  while (size >= UINT_MAX)
>> >i++;
>> 
>> Please spell it U32_MAX
>
> Why?  there's no reason not to use the arithmetic UINT_MAX here.  Either
> works, of course but UINT_MAX is standard.

We're dealing with explicitly sized integers, and the comment even says
that we're reducing till we fit in 32 bits, so that we can do a
32x32->64 multiplication. U32_MAX is the natural name for the
appropriate constant.

Also, you could do > U32_MAX instead of >= U32_MAX, but that's unlikely
to make any difference (well, except it might generate slightly better
code, since it would allow gcc to just test the upper half for being 0,
which might be cheaper on some architectures than comparing to a
literal).

Rasmus
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] string_helpers: fix precision loss for some inputs

2015-11-03 Thread James Bottomley
On Tue, 2015-11-03 at 23:13 +0100, Rasmus Villemoes wrote:
> On Tue, Nov 03 2015, James Bottomley  
> wrote:
> 
> > From: James Bottomley 
> >
> > It was noticed that we lose precision in the final calculation for some
> > inputs.  The most egregious example is size=3000 blk_size=1900 in units of 
> > 10
> > should yield 5.70 MB but in fact yields 3.00 MB (oops). This is because the
> > current algorithm doesn't correctly account for all the remainders in the
> > logarithms.  Fix this by doing a correct calculation in the remainders based
> > on napier's algorithm.  Additionally, now we have the correct result, we 
> > have
> > to account for arithmetic rounding because we're printing 3 digits of
> > precision.  This means that if the fourth digit is five or greater, we have 
> > to
> > round up, so add a section to ensure correct rounding.  Finally account for
> > all possible inputs correctly, including zero for block size.
> >
> > Reported-by: Vitaly Kuznetsov 
> > Cc: sta...@vger.kernel.org  # delay backport by two months for testing
> > Fixes: b9f28d863594c429e1df35a0474d2663ca28b307
> > Signed-off-by: James Bottomley 
> >
> > --
> >
> > v2: updated with a recommendation from Rasmus Villemoes to truncate the
> > initial precision at just under 32 bits
> >
> > diff --git a/lib/string_helpers.c b/lib/string_helpers.c
> > index 5939f63..363faca 100644
> > --- a/lib/string_helpers.c
> > +++ b/lib/string_helpers.c
> > @@ -43,38 +43,40 @@ void string_get_size(u64 size, u64 blk_size, const enum 
> > string_size_units units,
> > [STRING_UNITS_10] = 1000,
> > [STRING_UNITS_2] = 1024,
> > };
> > -   int i, j;
> > -   u32 remainder = 0, sf_cap, exp;
> > +   static const unsigned int rounding[] = { 500, 50, 5, 0};
> 
> j necessarily ends up being 0, 1 or 2. Any reason to include the last entry?

No reason beyond a vague worry someone might try to increase the printed
precision by one digit.

> > +
> > +   while (blk_size >= UINT_MAX)
> > i++;
> > -   }
> >  
> > -   exp = divisor[units] / (u32)blk_size;
> > -   /*
> > -* size must be strictly greater than exp here to ensure that remainder
> > -* is greater than divisor[units] coming out of the if below.
> > -*/
> > -   if (size > exp) {
> > -   remainder = do_div(size, divisor[units]);
> > -   remainder *= blk_size;
> > +   while (size >= UINT_MAX)
> > i++;
> 
> Please spell it U32_MAX

Why?  there's no reason not to use the arithmetic UINT_MAX here.  Either
works, of course but UINT_MAX is standard.

> . Also, it's not clear why you left out the
> do_divs ;-)

Over reduction.

James


> Rasmus
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] string_helpers: fix precision loss for some inputs

2015-11-03 Thread James Bottomley
On Tue, 2015-11-03 at 13:37 -0800, Bart Van Assche wrote:
> On 11/03/2015 01:21 PM, James Bottomley wrote:
> > +   while (blk_size >= UINT_MAX)
> > i++;
> 
> (reduced CC-list)

Let's keep at least the lists in cc.

> Hello James,
> 
> Is the above loop an infinite loop if blk_size >= UINT_MAX ?

Yes, a little too much truncation over the original.  I'll post a v3.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] string_helpers: fix precision loss for some inputs

2015-11-03 Thread Rasmus Villemoes
On Tue, Nov 03 2015, James Bottomley  
wrote:

> From: James Bottomley 
>
> It was noticed that we lose precision in the final calculation for some
> inputs.  The most egregious example is size=3000 blk_size=1900 in units of 10
> should yield 5.70 MB but in fact yields 3.00 MB (oops). This is because the
> current algorithm doesn't correctly account for all the remainders in the
> logarithms.  Fix this by doing a correct calculation in the remainders based
> on napier's algorithm.  Additionally, now we have the correct result, we have
> to account for arithmetic rounding because we're printing 3 digits of
> precision.  This means that if the fourth digit is five or greater, we have to
> round up, so add a section to ensure correct rounding.  Finally account for
> all possible inputs correctly, including zero for block size.
>
> Reported-by: Vitaly Kuznetsov 
> Cc: sta...@vger.kernel.org# delay backport by two months for testing
> Fixes: b9f28d863594c429e1df35a0474d2663ca28b307
> Signed-off-by: James Bottomley 
>
> --
>
> v2: updated with a recommendation from Rasmus Villemoes to truncate the
> initial precision at just under 32 bits
>
> diff --git a/lib/string_helpers.c b/lib/string_helpers.c
> index 5939f63..363faca 100644
> --- a/lib/string_helpers.c
> +++ b/lib/string_helpers.c
> @@ -43,38 +43,40 @@ void string_get_size(u64 size, u64 blk_size, const enum 
> string_size_units units,
>   [STRING_UNITS_10] = 1000,
>   [STRING_UNITS_2] = 1024,
>   };
> - int i, j;
> - u32 remainder = 0, sf_cap, exp;
> + static const unsigned int rounding[] = { 500, 50, 5, 0};

j necessarily ends up being 0, 1 or 2. Any reason to include the last entry?

> +
> + while (blk_size >= UINT_MAX)
>   i++;
> - }
>  
> - exp = divisor[units] / (u32)blk_size;
> - /*
> -  * size must be strictly greater than exp here to ensure that remainder
> -  * is greater than divisor[units] coming out of the if below.
> -  */
> - if (size > exp) {
> - remainder = do_div(size, divisor[units]);
> - remainder *= blk_size;
> + while (size >= UINT_MAX)
>   i++;

Please spell it U32_MAX. Also, it's not clear why you left out the
do_divs ;-)

Rasmus
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] string_helpers: fix precision loss for some inputs

2015-11-03 Thread James Bottomley
From: James Bottomley 

It was noticed that we lose precision in the final calculation for some
inputs.  The most egregious example is size=3000 blk_size=1900 in units of 10
should yield 5.70 MB but in fact yields 3.00 MB (oops). This is because the
current algorithm doesn't correctly account for all the remainders in the
logarithms.  Fix this by doing a correct calculation in the remainders based
on napier's algorithm.  Additionally, now we have the correct result, we have
to account for arithmetic rounding because we're printing 3 digits of
precision.  This means that if the fourth digit is five or greater, we have to
round up, so add a section to ensure correct rounding.  Finally account for
all possible inputs correctly, including zero for block size.

Reported-by: Vitaly Kuznetsov 
Cc: sta...@vger.kernel.org  # delay backport by two months for testing
Fixes: b9f28d863594c429e1df35a0474d2663ca28b307
Signed-off-by: James Bottomley 

--

v2: updated with a recommendation from Rasmus Villemoes to truncate the
initial precision at just under 32 bits

diff --git a/lib/string_helpers.c b/lib/string_helpers.c
index 5939f63..363faca 100644
--- a/lib/string_helpers.c
+++ b/lib/string_helpers.c
@@ -43,38 +43,40 @@ void string_get_size(u64 size, u64 blk_size, const enum 
string_size_units units,
[STRING_UNITS_10] = 1000,
[STRING_UNITS_2] = 1024,
};
-   int i, j;
-   u32 remainder = 0, sf_cap, exp;
+   static const unsigned int rounding[] = { 500, 50, 5, 0};
+   int i = 0, j;
+   u32 remainder = 0, sf_cap;
char tmp[8];
const char *unit;
 
tmp[0] = '\0';
-   i = 0;
-   if (!size)
+
+   if (blk_size == 0)
+   size = 0;
+   if (size == 0)
goto out;
 
-   while (blk_size >= divisor[units]) {
-   remainder = do_div(blk_size, divisor[units]);
+   /* This is napier's algorithm.  Reduce the original block size to
+*
+* coefficient * divisor[units]^i
+*
+* we do the reduction so both coefficients are just under 32 bits so
+* that multiplying them together won't overflow 64 bits and we keep
+* as much precision as possible in the numbers
+*/
+
+   while (blk_size >= UINT_MAX)
i++;
-   }
 
-   exp = divisor[units] / (u32)blk_size;
-   /*
-* size must be strictly greater than exp here to ensure that remainder
-* is greater than divisor[units] coming out of the if below.
-*/
-   if (size > exp) {
-   remainder = do_div(size, divisor[units]);
-   remainder *= blk_size;
+   while (size >= UINT_MAX)
i++;
-   } else {
-   remainder *= size;
-   }
+
+   /* now perform the actual multiplication keeping i as the sum of the
+* two logarithms */
 
size *= blk_size;
-   size += remainder / divisor[units];
-   remainder %= divisor[units];
 
+   /* and logarithmically reduce it until it's just under the divisor */
while (size >= divisor[units]) {
remainder = do_div(size, divisor[units]);
i++;
@@ -84,9 +86,20 @@ void string_get_size(u64 size, u64 blk_size, const enum 
string_size_units units,
for (j = 0; sf_cap*10 < 1000; j++)
sf_cap *= 10;
 
+   /* express the remainder as a decimal (it's currently the numerator of
+* a fraction whose denominator is divisor[units]) */
+   remainder *= 1000;
+   remainder /= divisor[units];
+
+   /* add a 5 to the digit below what will be printed to ensure
+* an arithmetical round up and carry it through to size */
+   remainder += rounding[j];
+   if (remainder >= 1000) {
+   remainder -= 1000;
+   size += 1;
+   }
+
if (j) {
-   remainder *= 1000;
-   remainder /= divisor[units];
snprintf(tmp, sizeof(tmp), ".%03u", remainder);
tmp[j+1] = '\0';
}


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html