Hello,

I'm using Avro as the serialization framework for my Hadoop MapReduce 
jobs, and my mapper classes emit GenericRecord/null as their key/value pairs. 
Having looked at the code, I see that my reducer treats the "key" objects 
(i.e. my records) as distinct only if calling .equals() on them reports a 
difference. However, if the 
schema is the same (which it is for most of my mappers), then .equals() calls 
.compare(), which in turn depends on the ORDER attributes set on the fields. 
This means that if no sort order is defined in my schema, all records are 
treated as equal to one another. Have I understood this correctly, and if 
so, is that not a violation of the equals contract? (For one thing, it would 
mean that GenericRecord objects often cause confusion when used with maps and 
other containers.)
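
To make the concern concrete, here is a minimal Java sketch. It does not use Avro itself: Order, Rec and distinctCount are hypothetical stand-ins that only model the behaviour described above, where equals() delegates to a field-by-field compare that skips fields whose order is "ignore". Whether Avro really behaves this way when no order is declared is exactly what I am asking.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class OrderEqualsSketch {
    // Hypothetical stand-in for a per-field sort order attribute.
    enum Order { ASCENDING, DESCENDING, IGNORE }

    // Hypothetical, simplified record: one order and one string value per field.
    static final class Rec {
        final Order[] orders;   // plays the role of the "schema"
        final String[] values;  // same length as orders

        Rec(Order[] orders, String... values) {
            this.orders = orders;
            this.values = values;
        }

        // Field-by-field comparison that skips IGNORE fields entirely.
        static int compare(Rec a, Rec b) {
            for (int i = 0; i < a.values.length; i++) {
                if (a.orders[i] == Order.IGNORE) continue;
                int c = a.values[i].compareTo(b.values[i]);
                if (c != 0) return a.orders[i] == Order.DESCENDING ? -c : c;
            }
            return 0;
        }

        // Equality defined purely in terms of compare(), as in the description above.
        @Override public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Rec)) return false;
            Rec that = (Rec) o;
            if (!Arrays.equals(orders, that.orders)) return false; // "same schema" check
            return compare(this, that) == 0;
        }

        // Hash only the fields that take part in compare(), to stay consistent with equals().
        @Override public int hashCode() {
            int h = 1;
            for (int i = 0; i < values.length; i++) {
                if (orders[i] != Order.IGNORE) h = 31 * h + values[i].hashCode();
            }
            return h;
        }
    }

    // Adds two records with different values in their single field to a HashSet
    // and reports how many distinct keys the set ends up holding.
    static int distinctCount(Order order) {
        Order[] schema = { order };
        Set<Rec> keys = new HashSet<>();
        keys.add(new Rec(schema, "alice"));
        keys.add(new Rec(schema, "bob"));
        return keys.size();
    }

    public static void main(String[] args) {
        System.out.println("ascending: " + distinctCount(Order.ASCENDING)); // prints "ascending: 2"
        System.out.println("ignore:    " + distinctCount(Order.IGNORE));    // prints "ignore:    1"
    }
}
```

In this model, two records that differ only in an ignored field collapse into a single key, which is the kind of surprise with maps, sets and reducer grouping that I am worried about.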


Regards,

Andrew
