Re: [Jprogramming] Round to nearest integer: harder than it seems

Devon McCormick Thu, 14 Jan 2016 16:34:24 -0800

Personally, I avoid paying attention to internal data types whenever
possible as this leads us down an implementation-dependent rabbit hole with
little upside.


My own rounding utility "roundNums" - found here:
http://code.jsoftware.com/wiki/User:Devon_McCormick/Utilities - allows
options for different kinds of rounding - up, down, and banker's - for the
cases where we're on an exact halfway point.  Banker's rounding - also
referenced here http://code.jsoftware.com/wiki/Addons/general/misc -
attempts to avoid systematic bias by rounding either up or down.  This
would be important, e.g., in the case of a series resulting from the
average of two series of whole numbers where rounding either up or down
would introduce directional bias.

Also, I like my way of specifying the rounding precision with a minimum
increment, i.e. "1" for whole numbers or "0.05" for nickels, as this is
intuitive and flexible.

On Thu, Jan 14, 2016 at 7:01 PM, Marshall Lochbaum <[email protected]>
wrote:

> Actually, you're off by one on the low end--a quirk of two's complement
> notation is that the minimum possible value is actually one less than
> minus the maximum. But the bigger problem is that doing things with rank
> zero is very slow. I was working with audio data where there are
> typically almost a hundred thousand values per second (and, admittedly,
> where any more than 24 bits in the output is overkill), so this wouldn't
> be fast enough.
>
> Marshall
>
> On Thu, Jan 14, 2016 at 09:39:20PM +0000, 'Pascal Jasmin' via Programming
> wrote:
> > anything wrong with this?
> >
> > (0.5 <.@:+ ])`(9223372036854775807 * *)@.(9223372036854775807 < |)("0) _
> 234234.3 __
> >
> > works as an adverb where you provide your rounding function.
> >
> >
> > roundA =: `(9223372036854775807 * *)(@.(9223372036854775807 < |))("0)
> >
> > (0.5 <.@:+ ]) roundA _ 234234.3 __
> > >. roundA _ 234234.3 __
> >
> > ----- Original Message -----
> > From: Marshall Lochbaum <[email protected]>
> > To: [email protected]
> > Sent: Thursday, January 14, 2016 2:56 PM
> > Subject: [Jprogramming] Round to nearest integer: harder than it seems
> >
> > Here's something I spent far too long on, and consequently thought was
> > worth sharing. I can turn it into an essay on the J wiki if people want
> > that.
> >
> > Recently I ran into the problem of rounding a J floating point number to
> > an integer, and forcing the result to have integer type. This seems like
> > a simple task: using a standard rounding function, we have
> >    0.5 <.@:+ 2.3 5.1 7.6 3.9
> > 2 5 8 4
> > But with numbers that are too large, the result still contains
> > floating-point numbers, and has type 8 (floating point) rather than 4
> > (integer).
> >    0.5 <.@:+ _1e50 2e30 _
> > _1e50 2e30 _
> >    3!:0 ]0.5 <.@:+ _1e50 2e30 _
> > 8
> > When applied to a float, (<.) applies the C floor function, which yields
> > another float, and than casts the results to integers if all of them are
> > exactly representable as integers. They're not here, so they are left as
> > floating-point numbers.
> >
> > To give a more accurate problem statement, I want the 64-bit signed
> > integer which is closest to the function input. Thus numbers above the
> > maximum representable integer should round to that integer, and likewise
> > for numbers below the minimum representable integer. We define these two
> > bounds now.
> >    MAX =: ->: MIN =: _2 <.@^ 63
> > Note that since MAX is one less than 2^63, trying to take (2<.@^63)
> > would give us a float, and subtracting one would still leave us with a
> > floating point number, which is not actually equal to MAX since (>:MAX)
> > is representable as a float, while nearby integers are not. MIN on the
> > other hand is safely computed as an exponent. Note the negative base,
> > which works because 63 is odd.
> >
> > With these bounds, our problem should be easy: clamp to the integer
> > range, then use (<.).
> >    ([: <. MIN>.MAX<.]) 0.5 + __ _1e30 _1e10 _100.3
> > _9223372036854775808 _9223372036854775808 _10000000000 _100
> > So far, so good...
> >    ([: <. MIN>.MAX<.]) 0.5 + 50.4 2e10 1e30 _
> > 50 2e10 9.22337e18 9.22337e18
> > Oops. What happened?
> >    MAX <. _
> > 9.22337e18
> > Since one of the arguments is a float, (<.) casts both to floats, and
> > takes the minimum. But the closest floating-point number to MAX is
> > (MAX+1), and that number's floor (MAX+1) isn't representable as an
> > integer--it's one too big. We didn't have this problem with MIN, since
> > it is exactly a negative power of two.
> >    <. MAX+1
> > 9.22337e18
> >
> > We'll make a test case that contains numbers close to both bounds. I've
> > included the addition of 0.5 in t so I can focus on the floor function
> > from now on. The type of our result is a float, so we failed.
> >    t =. 0.5 + __,_,~ (MIN,0,MAX) +/(,@:) i:1e5
> >    3!:0 ([: <. MIN>.MAX<.]) t
> > 8
> >
> > We can fix the problem by using exact integers, but it's extremely slow.
> > However, it serves as a good answer key. The ("0) is there for a
> > reason--otherwise the big array of exact integers tends to flood RAM.
> >    fl_e =: (MIN>.MAX<.<.)&.:x:"0  NB. exact floor
> >    3!:0 key =. fl_e t
> > 4
> >    10 (6!:2) 'fl_e t'
> > 3.62018
> >
> > If we use a number small enough that its floating-point representation
> > is equal to a 64-bit integer, then we can force our answer to be a
> > float, but it's not correct since the results are sometimes too small.
> > If that doesn't matter and speed is critical, this is the right method
> > to use.
> >    MAX1 =. MAX - 512
> >    fl_f =: [: <. MIN>.MAX1<.]  NB. fast floor
> >    3!:0 fl_f t
> > 4
> >    key -: fl_f t
> > 0
> >    10 (6!:2) 'fl_f t'
> > 0.0179205
> >
> > Finally, my solution. It's not particularly elegant, but it is correct
> > and has good performance. We reduce all the values larger than MAX to
> > zero, then clamp on the minimum side and take the floor. For the values
> > that we removed, we add MAX back in. The comparison (<:&MAX) is only
> > computed once to save a little time.
> >    fl =: ((MAX*-.@]) + [: <. MIN>.*) <:&MAX
> >    key -: fl t
> > 1
> >    10 (6!:2) 'fl t'
> > 0.0426181
> > It's critical to use (<:) rather than (<) to test whether numbers are
> > acceptable even though it fails MAX, which wouldn't break (<.). That's
> > because comparisons cast their arguments to floats before comparing, so
> >    MAX < MAX+1
> > 0
> >
> > Maybe there's a quicker solution to be found. The following rounds
> > towards zero quickly by negating all the positive numbers, and restoring
> > their signs later. However, adding in the cases to make it equal to (<.)
> > on small numbers removes its advantage.
> >    fl_o =: (] * MIN <.@:>. *) -@:*  NB. floor towards zero
> >    fl_o _4.6 _3 _2.8 _1.2 3.4 5.8 9
> > _5 _3 _3 _2 4 6 9
> >    10 (6!:2) 'fl_o t'
> > 0.0324293
> >
> > Any takers?
> >
> > Marshall
> > ----------------------------------------------------------------------
> > For information about J forums see http://www.jsoftware.com/forums.htm
> > ----------------------------------------------------------------------
> > For information about J forums see http://www.jsoftware.com/forums.htm
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
>



-- 

Devon McCormick, CFA

Quantitative Consultant
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Re: [Jprogramming] Round to nearest integer: harder than it seems

Reply via email to