[Jprogramming] Round to nearest integer: harder than it seems

Marshall Lochbaum Thu, 14 Jan 2016 11:56:36 -0800

Here's something I spent far too long on, and consequently thought was
worth sharing. I can turn it into an essay on the J wiki if people want
that.


Recently I ran into the problem of rounding a J floating point number to
an integer, and forcing the result to have integer type. This seems like
a simple task: using a standard rounding function, we have
   0.5 <.@:+ 2.3 5.1 7.6 3.9
2 5 8 4
But with numbers that are too large, the result still contains
floating-point numbers, and has type 8 (floating point) rather than 4
(integer).
   0.5 <.@:+ _1e50 2e30 _
_1e50 2e30 _
   3!:0 ]0.5 <.@:+ _1e50 2e30 _
8
When applied to a float, (<.) applies the C floor function, which yields
another float, and than casts the results to integers if all of them are
exactly representable as integers. They're not here, so they are left as
floating-point numbers.

To give a more accurate problem statement, I want the 64-bit signed
integer which is closest to the function input. Thus numbers above the
maximum representable integer should round to that integer, and likewise
for numbers below the minimum representable integer. We define these two
bounds now.
   MAX =: ->: MIN =: _2 <.@^ 63
Note that since MAX is one less than 2^63, trying to take (2<.@^63)
would give us a float, and subtracting one would still leave us with a
floating point number, which is not actually equal to MAX since (>:MAX)
is representable as a float, while nearby integers are not. MIN on the
other hand is safely computed as an exponent. Note the negative base,
which works because 63 is odd.

With these bounds, our problem should be easy: clamp to the integer
range, then use (<.).
   ([: <. MIN>.MAX<.]) 0.5 + __ _1e30 _1e10 _100.3
_9223372036854775808 _9223372036854775808 _10000000000 _100
So far, so good...
   ([: <. MIN>.MAX<.]) 0.5 + 50.4 2e10 1e30 _
50 2e10 9.22337e18 9.22337e18
Oops. What happened?
   MAX <. _
9.22337e18
Since one of the arguments is a float, (<.) casts both to floats, and
takes the minimum. But the closest floating-point number to MAX is
(MAX+1), and that number's floor (MAX+1) isn't representable as an
integer--it's one too big. We didn't have this problem with MIN, since
it is exactly a negative power of two.
   <. MAX+1
9.22337e18

We'll make a test case that contains numbers close to both bounds. I've
included the addition of 0.5 in t so I can focus on the floor function
from now on. The type of our result is a float, so we failed.
   t =. 0.5 + __,_,~ (MIN,0,MAX) +/(,@:) i:1e5
   3!:0 ([: <. MIN>.MAX<.]) t
8

We can fix the problem by using exact integers, but it's extremely slow.
However, it serves as a good answer key. The ("0) is there for a
reason--otherwise the big array of exact integers tends to flood RAM.
   fl_e =: (MIN>.MAX<.<.)&.:x:"0  NB. exact floor
   3!:0 key =. fl_e t
4
   10 (6!:2) 'fl_e t'
3.62018

If we use a number small enough that its floating-point representation
is equal to a 64-bit integer, then we can force our answer to be a
float, but it's not correct since the results are sometimes too small.
If that doesn't matter and speed is critical, this is the right method
to use.
   MAX1 =. MAX - 512
   fl_f =: [: <. MIN>.MAX1<.]  NB. fast floor
   3!:0 fl_f t
4
   key -: fl_f t
0
   10 (6!:2) 'fl_f t'
0.0179205

Finally, my solution. It's not particularly elegant, but it is correct
and has good performance. We reduce all the values larger than MAX to
zero, then clamp on the minimum side and take the floor. For the values
that we removed, we add MAX back in. The comparison (<:&MAX) is only
computed once to save a little time.
   fl =: ((MAX*-.@]) + [: <. MIN>.*) <:&MAX
   key -: fl t
1
   10 (6!:2) 'fl t'
0.0426181
It's critical to use (<:) rather than (<) to test whether numbers are
acceptable even though it fails MAX, which wouldn't break (<.). That's
because comparisons cast their arguments to floats before comparing, so
   MAX < MAX+1
0

Maybe there's a quicker solution to be found. The following rounds
towards zero quickly by negating all the positive numbers, and restoring
their signs later. However, adding in the cases to make it equal to (<.)
on small numbers removes its advantage.
   fl_o =: (] * MIN <.@:>. *) -@:*  NB. floor towards zero
   fl_o _4.6 _3 _2.8 _1.2 3.4 5.8 9
_5 _3 _3 _2 4 6 9
   10 (6!:2) 'fl_o t'
0.0324293

Any takers?

Marshall
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

[Jprogramming] Round to nearest integer: harder than it seems

Reply via email to