On 02/01/2012 05:58 PM, John Boyle wrote:
I happened to observe this commit from today by Neil Toronto:
http://git.racket-lang.org/plt/commitdiff/47fcdd4916a2d33ee5c28eb833397ce1d2a515e2
I may have some useful input on this, having dealt with similar problems
myself.
The problem: Given b and x, you want to find the largest integer n such
that b^n <= x (or, for the ceiling problem, the smallest integer n such
that b^n ≥ x). This can profitably be viewed as a case of a more
general problem: given a monotonically increasing but otherwise opaque
function f, find the largest integer n such that f(n) ≤ x. The previous
algorithm to solve this can now be viewed as a linear search, taking n
steps. This took too long, and Neil Toronto replaced it with an
algorithm that depended on floating-point numbers; but I hate to see
that happen, because floats will screw up when the numbers get too
large; careful reasoning may prove that floats will be sufficiently
accurate for a given application, but it would be nice to have an exact
algorithm so that one didn't have to do that.
I come here to present such an algorithm. Simply put: Use a binary
search rather than a linear search. How do we do binary search on "the
nonnegative integers"? Well, start with 0 and 1 as your bounds, and
keep doubling the 1 until you get something that's actually too high;
then you have a real lower and upper bound. This will take log(n) steps
to find the upper bound, and then a further log(n) steps to tighten the
bounds to a single integer. Thus, this takes roughly 2*log(n) steps,
where each step involves calculating (expt b n) plus a comparison. It
might be possible to do slightly better by not treating b^n as a black
box (e.g. by using old results of b^n to compute new ones rather than
calling "expt" from scratch, or by using "integer-length" plus some
arithmetic to get some good initial bounds), but I suspect this is good
enough and further complexity is not worth it.
Note that this algorithm should work perfectly with any nonnegative
rational arguments [other than 0 for b or x, and 1 for b], and should
give as good an answer as any with floating-point arguments. Also,
"integer-finverse", as I called it, might be useful for many other
"floor"-type computations with exact numbers (I have so used it myself
in an ad hoc Arc math library).
;Finds n such that n >= 0 and f(n) <= x < f(n+1) in about 2*log(n) steps.
;Assumes f is monotonically increasing.
(define (integer-finverse f x)
(define upper-bound
(let loop ((n 1))
(if (> (f n) x)
n
(loop (* n 2)))))
(define (search a b) ;b is too big, a is not too big
(let ((d (- b a)))
(if (= d 1)
a
(let ((m (+ a (quotient d 2))))
(if (> (f m) x)
(search a m)
(search m b))))))
(search (quotient upper-bound 2) upper-bound))
(define (floor-log/base b x)
(cond ((< b 1) (- (ceiling-log/base (/ b) x)))
((= b 1) (error "Base shouldn't be 1."))
((< x 1) (- (ceiling-log/base b (/ x))))
(else (integer-finverse (lambda (n) (expt b n)) x))))
(define (ceiling-log/base b x)
(cond ((< b 1) (- (floor-log/base (/ b) x)))
((= b 1) (error "Base shouldn't be 1."))
((< x 1) (- (floor-log/base b (/ x))))
(else
(let ((u (floor-log/base b x)))
(if (= x (expt b u))
u
(+ u 1))))))
Testing:
> (floor-log/base 10 3)
0
> (floor-log/base 3 10)
2
> (ceiling-log/base 3 10)
3
> (ceiling-log/base 1/3 10)
-2
> (floor-log/base 1/3 10)
-3
> (floor-log/base 2/3 10)
-6
> (ceiling-log/base 2/3 10)
-5
> (exact->inexact (expt 3/2 6))
11.390625
> (exact->inexact (expt 3/2 5))
7.59375
Neet! I'll have to do some timing tests.
The functions are used to format plot labels for numbers outside
floating-point range. As it is, the iterations will fix any under- or
overestimate. Floating-point math is no problem: I just use it to get an
initial estimate.
Also, I was surprised to find that the `log' function works on numbers
well outside floating-point range. (Big ones, not small ones, which is
why I subtract the logs of the numerator and denominator.) It works by
taking successive integer square roots to reduce the argument first. If
it stops working at some number size, that number is probably too large
to fit in memory.
But for some large numbers that aren't too huge - say, megabyte-sized -
reducing the argument takes a *long time*. I'll be interested to try
switching to your solution for that case to see if it works out better.
Also, I'ma keep this code around for other stuff.
I might add, by the way, that I'm inclined to expect the base to be a
second, optional argument (perhaps defaulting to 10) to a function
called simply "floor-log" or "ceiling-log". I don't know if that fits
with desired conventions, though.
The most consistent default is the natural base, but the discontinuity
of floor/ceiling makes using transcendental bases pretty much
impossible. :/ Next best would be 2 or 10, but I couldn't decide between
them.
Neil ⊥
_________________________
Racket Developers list:
http://lists.racket-lang.org/dev