Cantor tuple function, BigInt, and integer overflow

Joseph Rushton Wakeling Thu, 20 Dec 2012 10:47:00 -0800

Hello all,

I've been playing around trying to implement the Cantor tuple function (seeattached code), which is essentially a way to map a sequence x1, x2, ..., xN ofnon-negative integers to a single unique number. See:

https://en.wikipedia.org/wiki/Pairing_function#Cantor_pairing_function


The inspiration came from a need to check pairs of numbers for uniqueness.

The current implementation is a bit stupid and naive (it falls into the sametrap as Andrei identifies in TDPL for the naive recursive Fibonacci function),but that's a deliberate short-term choice just to get something working-ish.

The challenge here is that as the sequence length increases, we very quickly getto the point where the number calculated will overflow the bounds of any inbuiltinteger type. Indeed, even where you're just dealing with a pair of numbers,you risk overflow if the values are close to the maximum of their given type.For this reason I've made use of BigInt internally and as a return value.

However, it occurs to me that it would be nice to avoid BigInt unless it'snecessary, both for speed/memory reasons and for convenience. I'm wondering ifanyone has any advice on how to do this safely, i.e. to definitively check thatwhether overflow is going to happen and if it is, to make use of a larger-scaletype.

I'd also like to be able to implement the cantorPair function so that it cantake any integer input _including_ BigInt; however, implementing a templatedversion runs into the difficulty of clashing with the BigInt version. Oneapproach I tried involved a private function taking only BigInt input, with atemplated version that takes arbitrary input types and converts them. This raninto trouble since I could not find a ready way to check if an input was itselfa BigInt. So again, any advice here is welcome. :-)


Incidentally, a few notes on BigInt:

    -- it would be nice if there was a toULong as well as a toLong function

    -- the following fails:

               BigInt x = BigInt(y);

       ... if y is itself a BigInt.  Worth adding to BigInt's constructor to
       correct this?

Advice, thoughts, etc., welcome. This seemed like something that could beuseful for Phobos if cleaned up and well implemented .... ?

import std.bigint, std.conv, std.exception, std.math, std.range, std.stdio, std.traits;

BigInt cantorPair(BigInt x, BigInt y)
{
	enforce(x >= 0);
	enforce(y >= 0);
	return ((x + y)*(x + y + 1))/2 + y;
}

BigInt cantorTuple(R)(R r)
	if(isInputRange!R && hasLength!R)
{
	enforce(r.length > 0);
	if(r.length == 1)
	{
		enforce(r[$-1] >= 0);
		return BigInt(r[$-1]);
	}
	else
		return cantorPair(cantorTuple(r[0 .. $-1]), BigInt(r[$-1]));
}


void main()
{
	foreach(n; iota(1, 10))
	{
		long[] x;
		x.length = n;
		foreach(i, ref val; x)
			val = i;

		BigInt c = cantorTuple(x);
		writeln(n, "\t", c);
	}
	writeln();

	writeln(cantorTuple([ulong.max, ulong.max]));
	writeln(cantorPair(BigInt(ulong.max), BigInt(ulong.max)));
}

Cantor tuple function, BigInt, and integer overflow

Reply via email to