Putting aside for a moment the question of hashing -0 and +0, I wonder if this
could be addressed by ordering floating-point numbers using the IEEE 754
totalOrder predicate and, when a file contains a NaN in some field, omitting
that field from
manifest_entry.data_file.{sort_columns, lower_bounds, upper_bounds}.
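
To make the idea concrete, here is a rough sketch of what I have in mind. The
names total_order_key and float_bounds are made up for illustration and are not
part of any existing library or of the spec. It assumes 64-bit doubles and that
"omit the field" simply means the writer gets None back and skips the entry.

```python
import math
import struct
from typing import Iterable, Optional, Tuple


def total_order_key(x: float) -> int:
    """Map a double to an unsigned integer whose natural ordering matches
    the IEEE 754 totalOrder predicate (hypothetical helper)."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    if bits & (1 << 63):
        # Sign bit set: flip every bit so larger magnitudes sort lower.
        return bits ^ 0xFFFFFFFFFFFFFFFF
    # Sign bit clear: set it so these values sort above all negatives.
    return bits | (1 << 63)


def float_bounds(values: Iterable[float]) -> Optional[Tuple[float, float]]:
    """Return (lower, upper) under totalOrder, or None when the bounds
    should be omitted because the column is empty or contains a NaN."""
    vals = list(values)
    if not vals or any(math.isnan(v) for v in vals):
        return None  # writer leaves this field out of lower/upper bounds
    return min(vals, key=total_order_key), max(vals, key=total_order_key)
```

Under this key -0.0 sorts strictly below +0.0, so the lower bound of a column
holding both zeros is -0.0, whereas a plain <= comparison treats them as equal.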
The logic here is that, though ham-fisted, this would also prevent engines from
misinterpreting these fields. A natural follow-up question is, "should we
populate these values in some other way that is less likely to be misinterpreted
by compute engines?" IIRC, Parquet's transition from {min,max} to
{min_value,max_value} was motivated by an ambiguity or bug in the spec. This
starts to get a bit arcane, but maybe we WANT a speed bump to stop engines from
pruning or searching with non-total-order operators like <=.
Thoughts?