ni...@lysator.liu.se (Niels Möller) writes:

> Ah, and if we're brainstorming, yet another variant would be to
> compute an extra limb (or maybe only half a limb) of precision when
> we compute the 2-adic k'th root. If the extra limb is non-zero, then
> the input can't be a k'th power. That's going to be asymptotically a
> bit slower than the O(1) logarithm trick, but it can have a much
> smaller false positive rate.

That's a good idea, and easy to implement correctly as well.

If I understand the current code correctly, it goes to some lengths to
compute not even one extra bit. Making it more limb-oriented is
probably a good idea irrespective of your suggestion.
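To make the extra-precision idea concrete, here is a small Python sketch (function names and the odd-N, odd-k restriction are mine, not GMP's): lift the unique 2-adic k'th root of N to ceil(bits(N)/k) + extra bits by Hensel/Newton iteration; if any of the extra bits are non-zero, N cannot be a k'th power, and a non-power slips through only with probability roughly 2^-extra.

```python
def kth_root_2adic(n, k, prec):
    """Unique odd x with x**k == n (mod 2**prec), for odd n and odd k."""
    assert n % 2 == 1 and k % 2 == 1
    x, bits = 1, 1                       # x**k == n (mod 2) since both are odd
    while bits < prec:
        bits = min(2 * bits, prec)       # Newton doubles the 2-adic precision
        m = 1 << bits
        fp = k * pow(x, k - 1, m) % m    # f'(x) = k*x^(k-1), odd hence invertible
        x = (x - (pow(x, k, m) - n) * pow(fp, -1, m)) % m
    return x

def maybe_kth_power(n, k, extra_bits=64):
    """False: n is certainly not a k'th power (n odd, k odd).
    True: the extra bits of the 2-adic root vanish, so n probably is one;
    a random non-power passes with probability about 2**-extra_bits."""
    b = (n.bit_length() + k - 1) // k    # bit length of the true root, if any
    x = kth_root_2adic(n, k, b + extra_bits)
    return x >> b == 0                   # non-zero extra bits rule out a power
```

A "True" answer would still have to be confirmed by computing the full root, but the point above stands: the extra limb(s) filter out non-powers far more reliably than an O(1) logarithm test, at the cost of a slightly longer 2-adic computation.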
I am now thinking of the usages, and the typical operands of those
usages.

When asking "is N a perfect power?", I assume it will be common that N
is not actually a perfect power. We should optimise for that.

When computing mpz_rootexact(R,N,k) for a specific k, the user claims
that she knows N^{1/k} is an integer. We should optimise for that,
meaning that we should not compute a b-adic root for some small b
first.

When computing some function mpz_is_this_a_kth_root(R,N,k) for a
specific k, I would again assume that N is usually not a kth power.

It is probably not uncommon that we want the root for perfect powers,
as Marco suggested. Then it will be faster to compute it as a side
effect of checking its existence. (I.e., mpz_perfect_power_p +
mpz_rootexact will be slower, but so will
mpz_perfect_power_p_and_get_me_k + mpz_rootexact...)

Any more scenarios?

Whatever interface we choose, we should make sure we can implicitly or
explicitly determine the likelihood of N being a perfect power.

For computing mpz_rootexact(R,N,k), the optimal strategy is most
likely to compute the upper half using the Euclidean norm and the
lower part using a b-adic norm. We don't have Euclidean nth-root code
that is O(M(log(N^{1/k}))) = O(M(log(N)/k)) on average for arbitrary
N. It should be much easier to write such code for perfect kth powers
N than for arbitrary N.

Like with Jebelean's divexact trick, this should use a few bits of
"overlap" in order to glue the upper and lower halves together. (Why
would this be faster? Both computations are superlinear, so two
half-size computations must be (asymptotically) faster than one
full-size computation.)

-- 
Torbjörn
_______________________________________________
gmp-devel mailing list
gmp-devel@gmplib.org
http://gmplib.org/mailman/listinfo/gmp-devel