I'm fine with implementing compareTo. With proper type annotations, we could also let a definition specify that a given map be instantiated as a TreeMap instead of a HashMap when deserializing, which would enforce an order.
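
A rough sketch of why a sorted map helps, under some assumptions: the serialize() method below is a hypothetical stand-in for a generated write method (it is not Thrift's actual API), and the class name and map contents are made up for illustration. It writes entries in whatever order the map iterates, so copying into a TreeMap before writing gives the same bytes for equal maps regardless of how they were built.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class SortedMapSerialization {
    // Hypothetical stand-in for a generated write(): emits the entry count,
    // then each key/value pair in the map's own iteration order.
    static byte[] serialize(Map<String, Integer> map) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(map.size());
            for (Map.Entry<String, Integer> e : map.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeInt(e.getValue());
            }
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            // Writing to an in-memory buffer should not fail; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Two equal maps built in different insertion orders and with
        // different initial capacities, so HashMap iteration order may differ.
        Map<String, Integer> a = new HashMap<>();
        a.put("alpha", 1);
        a.put("beta", 2);
        a.put("gamma", 3);

        Map<String, Integer> b = new HashMap<>(256);
        b.put("gamma", 3);
        b.put("beta", 2);
        b.put("alpha", 1);

        // HashMap iteration order is unspecified, so this may be true or false.
        System.out.println("HashMap bytes equal: "
                + Arrays.equals(serialize(a), serialize(b)));

        // TreeMap iterates in key order, so equal maps give equal bytes.
        System.out.println("TreeMap bytes equal: "
                + Arrays.equals(serialize(new TreeMap<>(a)), serialize(new TreeMap<>(b))));
    }
}
```

The TreeMap copy costs O(n log n) per write, which is the sorting overhead mentioned in the original message. For keys that are themselves structs, the ordering would have to come from the compareTo discussed above.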
--David

Bryan Duxbury wrote:
> We use Thrift structures within Hadoop Map/Reduce. Occasionally, a
> Thrift object will be our grouping or join key. Usually, this works
> great, but occasionally, there are some issues. In particular, we
> have trouble with maps and sets. The problem is that the ordering of
> the map/set internally is arbitrary, and we serialize in that
> arbitrary order. The result is that two 'equal' objects might not
> serialize into the same byte array, and therefore fail equality
> checks based only on the serialized data.
>
> I was wondering if it would make sense to enforce some sort of
> ordering scheme for collections where order might be arbitrary, at
> least during serialization. This would necessitate implementing a
> decent compareTo on generated Thrift structs so we could sort before
> writing, and obviously, it would include sorting overhead.
>
> Are other people interested in making this use case work acceptably?
>
> -Bryan