Hello all,
I've been playing around trying to implement the Cantor tuple function (see
attached code), which is essentially a way to map a sequence x1, x2, ..., xN of
non-negative integers to a single unique number. See:
https://en.wikipedia.org/wiki/Pairing_function#Cantor_pairing_function
The inspiration came from a need to check pairs of numbers for uniqueness.
The current implementation is a bit stupid and naive (it falls into the same
trap as Andrei identifies in TDPL for the naive recursive Fibonacci function),
but that's a deliberate short-term choice just to get something working-ish.
The challenge here is that as the sequence length increases, we very quickly get
to the point where the number calculated will overflow the bounds of any inbuilt
integer type. Indeed, even where you're just dealing with a pair of numbers,
you risk overflow if the values are close to the maximum of their given type.
For this reason I've made use of BigInt internally and as a return value.
However, it occurs to me that it would be nice to avoid BigInt unless it's
necessary, both for speed/memory reasons and for convenience. I'm wondering if
anyone has any advice on how to do this safely, i.e. to definitively check that
whether overflow is going to happen and if it is, to make use of a larger-scale
type.
I'd also like to be able to implement the cantorPair function so that it can
take any integer input _including_ BigInt; however, implementing a templated
version runs into the difficulty of clashing with the BigInt version. One
approach I tried involved a private function taking only BigInt input, with a
templated version that takes arbitrary input types and converts them. This ran
into trouble since I could not find a ready way to check if an input was itself
a BigInt. So again, any advice here is welcome. :-)
Incidentally, a few notes on BigInt:
-- it would be nice if there was a toULong as well as a toLong function
-- the following fails:
BigInt x = BigInt(y);
... if y is itself a BigInt. Worth adding to BigInt's constructor to
correct this?
Advice, thoughts, etc., welcome. This seemed like something that could be
useful for Phobos if cleaned up and well implemented .... ?
import std.bigint, std.conv, std.exception, std.math, std.range, std.stdio, std.traits;
BigInt cantorPair(BigInt x, BigInt y)
{
enforce(x >= 0);
enforce(y >= 0);
return ((x + y)*(x + y + 1))/2 + y;
}
BigInt cantorTuple(R)(R r)
if(isInputRange!R && hasLength!R)
{
enforce(r.length > 0);
if(r.length == 1)
{
enforce(r[$-1] >= 0);
return BigInt(r[$-1]);
}
else
return cantorPair(cantorTuple(r[0 .. $-1]), BigInt(r[$-1]));
}
void main()
{
foreach(n; iota(1, 10))
{
long[] x;
x.length = n;
foreach(i, ref val; x)
val = i;
BigInt c = cantorTuple(x);
writeln(n, "\t", c);
}
writeln();
writeln(cantorTuple([ulong.max, ulong.max]));
writeln(cantorPair(BigInt(ulong.max), BigInt(ulong.max)));
}