Hello all,

I've been playing around trying to implement the Cantor tuple function (see attached code), which is essentially a way to map a sequence x1, x2, ..., xN of non-negative integers to a single unique number. See:
https://en.wikipedia.org/wiki/Pairing_function#Cantor_pairing_function

The inspiration came from a need to check pairs of numbers for uniqueness.

The current implementation is a bit stupid and naive (it falls into the same trap as Andrei identifies in TDPL for the naive recursive Fibonacci function), but that's a deliberate short-term choice just to get something working-ish.

The challenge here is that as the sequence length increases, we very quickly get to the point where the number calculated will overflow the bounds of any inbuilt integer type. Indeed, even where you're just dealing with a pair of numbers, you risk overflow if the values are close to the maximum of their given type. For this reason I've made use of BigInt internally and as a return value.

However, it occurs to me that it would be nice to avoid BigInt unless it's necessary, both for speed/memory reasons and for convenience. I'm wondering if anyone has any advice on how to do this safely, i.e. to definitively check that whether overflow is going to happen and if it is, to make use of a larger-scale type.

I'd also like to be able to implement the cantorPair function so that it can take any integer input _including_ BigInt; however, implementing a templated version runs into the difficulty of clashing with the BigInt version. One approach I tried involved a private function taking only BigInt input, with a templated version that takes arbitrary input types and converts them. This ran into trouble since I could not find a ready way to check if an input was itself a BigInt. So again, any advice here is welcome. :-)

Incidentally, a few notes on BigInt:

    -- it would be nice if there was a toULong as well as a toLong function

    -- the following fails:

               BigInt x = BigInt(y);

       ... if y is itself a BigInt.  Worth adding to BigInt's constructor to
       correct this?

Advice, thoughts, etc., welcome. This seemed like something that could be useful for Phobos if cleaned up and well implemented .... ?
import std.bigint, std.conv, std.exception, std.math, std.range, std.stdio, std.traits;

BigInt cantorPair(BigInt x, BigInt y)
{
	enforce(x >= 0);
	enforce(y >= 0);
	return ((x + y)*(x + y + 1))/2 + y;
}

BigInt cantorTuple(R)(R r)
	if(isInputRange!R && hasLength!R)
{
	enforce(r.length > 0);
	if(r.length == 1)
	{
		enforce(r[$-1] >= 0);
		return BigInt(r[$-1]);
	}
	else
		return cantorPair(cantorTuple(r[0 .. $-1]), BigInt(r[$-1]));
}


void main()
{
	foreach(n; iota(1, 10))
	{
		long[] x;
		x.length = n;
		foreach(i, ref val; x)
			val = i;

		BigInt c = cantorTuple(x);
		writeln(n, "\t", c);
	}
	writeln();

	writeln(cantorTuple([ulong.max, ulong.max]));
	writeln(cantorPair(BigInt(ulong.max), BigInt(ulong.max)));
}

Reply via email to