Re: UUID.compareTo broken?

Peter Levart Thu, 10 Apr 2014 23:56:27 -0700


On 04/10/2014 08:21 PM, Steven Schlansker wrote:

On Apr 9, 2014, at 2:21 AM, Paul Sandoz <[email protected]> wrote:

On Apr 8, 2014, at 9:15 PM, Mike Duigou <[email protected]> wrote:

That seems a terribly broken usage of UUID for 128 bit numbers or a pair of 
signed 64 bit numbers :-)

Part of me thinks we should not be supporting such broken usage. Might be worth 
getting some usage data from grepcode or maven central.

I’m guilty of doing this at a point in the past.  We used it to intermix 
multiple sources of data - a few that used ‘long’ IDs and a few that used 
‘real’ UUIDs.  We took prefixes that had flags never generated by UUID 
libraries (the “reserved for compatibility with Microsoft” ones IIRC) and 
slapped those with a long to make pseudo-UUIDs.  That way everything was a UUID 
and we were guaranteed to never see collisions within our dataset.

We never expected them to give sensible meanings to the various getters e.g. 
timestamp() but we did expect the UUID class to work generally.
We never relied on any particular ordering.

We could provide static methods to return appropriate comparators for version 1 
and version 2 UUIDs--I've actually written them before for other projects.

It would be nice to just have one compareTo that does the right thing based on 
the UUID types being compared.

If it were up to me only the time and DCE UUIDs would be comparable, there's no 
ordering relationship for other versions.

I think it is fine for random UUIDs to be comparable with each other.

The comparators I've considered adding would only allow comparisons within the 
same version.

Yes, although for a general comparator the primary sort key could be the 
version value.

+1 to a “sort first by version then version-specific ordering” — it gives you 
the best of both worlds IMO.
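
A rough sketch of what such a "version first, then version-specific" comparator 
might look like. This is only an illustration, not anything proposed for the 
JDK: the class name VersionFirstUuidComparator is made up, and falling back to 
an unsigned comparison of the raw bits for versions other than 1 is an 
assumption of mine, chosen so the result is still a total order.

```java
import java.util.Comparator;
import java.util.UUID;

// Hypothetical comparator: primary key is the version field, secondary key
// depends on the version. Requires Java 8 for Long.compareUnsigned.
final class VersionFirstUuidComparator implements Comparator<UUID> {

    @Override
    public int compare(UUID a, UUID b) {
        // Primary sort key: the 4-bit version value.
        int c = Integer.compare(a.version(), b.version());
        if (c != 0) {
            return c;
        }
        if (a.version() == 1) {
            // Time-based UUIDs: order chronologically. timestamp() is only
            // defined for version 1 and yields a non-negative 60-bit value.
            c = Long.compare(a.timestamp(), b.timestamp());
            if (c != 0) {
                return c;
            }
        }
        // Assumed fallback for every other case: unsigned comparison of the
        // raw 128 bits, so that any two distinct UUIDs are still ordered.
        c = Long.compareUnsigned(a.getMostSignificantBits(),
                                 b.getMostSignificantBits());
        if (c != 0) {
            return c;
        }
        return Long.compareUnsigned(a.getLeastSignificantBits(),
                                    b.getLeastSignificantBits());
    }
}
```

Because the fallback covers every bit pattern, this still works for UUIDs read 
from an external source whose version is unknown or invalid.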

I think the natural ordering for UUIDs must be able to create an ordering over 
all possible UUIDs, no matter the version or even if it is valid or not.  If 
you read UUIDs from an external source you have no way to understand what 
version they are or aren’t.  Imagine a process that loads data from a database 
or text file into a TreeMap - it would be awful if a change in the UUID 
generation scheme on the far side caused the Comparator you used to no longer 
function.

Hi,

Code that relies on UUIDs to have a "natural" order, say "chronological", is 
relying on being given the particular type of UUIDs that have the time 
built-in. When given mixed-type or non-time-based UUIDs, such code will break. 
The purpose of UUID schemes is generating globally unique identifiers, not 
interpreting them. Various types exist just because it's practical to generate 
UUIDs differently in different contexts. Programs should not try to extract 
business information from UUIDs (except probably for hunting down virus 
authors: http://en.wikipedia.org/wiki/Melissa_%28computer_virus%29). So I think 
a good general-purpose compareTo() method should try to discourage such usage, 
for example by reversing the bits of the UUID value before comparing. This 
would also make algorithms that order UUIDs for facilitating quick access 
(B-Tree, Red-Black-Tree, ...) happier.
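
To make the bit-reversal idea concrete, here is a minimal sketch written as a 
standalone Comparator rather than as a change to UUID.compareTo() itself. The 
class name ReversedBitsUuidComparator is hypothetical, and treating the 
reversed value as an unsigned 128-bit number is my assumption about how the 
comparison would be done.

```java
import java.util.Comparator;
import java.util.UUID;

// Hypothetical comparator: orders UUIDs by their 128-bit value with all bits
// reversed, compared as an unsigned number. Requires Java 8.
final class ReversedBitsUuidComparator implements Comparator<UUID> {

    @Override
    public int compare(UUID a, UUID b) {
        // Reversing all 128 bits swaps the two halves: the reversed least
        // significant long becomes the most significant part of the key.
        int c = Long.compareUnsigned(
                Long.reverse(a.getLeastSignificantBits()),
                Long.reverse(b.getLeastSignificantBits()));
        if (c != 0) {
            return c;
        }
        return Long.compareUnsigned(
                Long.reverse(a.getMostSignificantBits()),
                Long.reverse(b.getMostSignificantBits()));
    }
}
```

With this ordering new UUID(0L, 1L) sorts after new UUID(0L, 2L), so values 
that are adjacent numerically end up far apart, and no chronological meaning 
can be read out of the order.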


Try the following benchmark (with options: -wi 5 -i 5 -t 1 -f 1):

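import java.util.Map;
import java.util.TreeMap;

// JMH imports, assuming the pre-1.0 package layout that matches the
// @GenerateMicroBenchmark and BlackHole names used below.
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.GenerateMicroBenchmark;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.logic.BlackHole;

public class TreeMapInsertionBench {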

    private static final int N = 100000;

    @GenerateMicroBenchmark()
    @BenchmarkMode(Mode.AverageTime)
    public static void testTreeMapSequentialInsert(BlackHole bh) {
        Map<Integer, Boolean> map = new TreeMap<>();
        for (int i = 0; i < N; i++) {
            map.put(new Integer(i), Boolean.TRUE);
        }
        bh.consume(map);
    }

    @GenerateMicroBenchmark()
    @BenchmarkMode(Mode.AverageTime)
    public static void testTreeMapReversedBitsInsert(BlackHole bh) {
        Map<Integer, Boolean> map = new TreeMap<>();
        for (int i = 0; i < N; i++) {
            map.put(new Integer(Integer.reverse(i)), Boolean.TRUE);
        }
        bh.consume(map);
    }
}

...and the results will be:

Benchmark                                                Mode  Samples    Mean  Mean error  Units
o.s.TreeMapInsertionBench.testTreeMapReversedBitsInsert  avgt        5   8.208       0.100  ms/op
o.s.TreeMapInsertionBench.testTreeMapSequentialInsert    avgt        5  10.498       1.164  ms/op



That's just an opinion.

Regards, Peter
