findepi commented on pull request #2891: URL: https://github.com/apache/iceberg/pull/2891#issuecomment-890738961
> It's also reasonable for Java to not distinguish between -NaN and NaN values and to only produce positive NaNs.

It would be more accurate to say that Java doesn't distinguish NaN values with the sign bit set from those without. In Java, you can construct a NaN value with the sign bit set, e.g. `Double.longBitsToDouble(0xfff8000000000000L)`.

If your intention is to expect Java writers to sort these before everything else, we still need to update the spec (and the Java writers). The spec points at Java sorting as the reference implementation. If your intention is to expect Java writers not to write NaN values with the sign bit set, this is in fact the canonicalization that you wanted to avoid.

As was observed, signalling NaNs are not portable between processors, so we should exclude them from the picture. While NaN values can carry an additional payload (the fraction bits, the sign), such use is not portable either. There might be legitimate use cases for these in some applications, but it seems we cannot expect them to be respected by the engines, which do not distinguish them.

Thus, if e.g. a Rust-based application writes -NaN values and a Java application rewrites the data file (update, compaction, format change), we should expect the NaN attributes to be lost. That can be confusing to users, who would consider it a bug, and one we probably wouldn't be able to fix.

It seems it would be safer to have just one NaN concept in the Iceberg spec and treat all NaN values as indistinguishable.
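To make the point about Java concrete, here is a minimal sketch using only `java.lang.Double`. The class name `NanSignBitDemo` and the helper `signedNan()` are illustrative and not part of Iceberg. It builds the NaN with the sign bit set mentioned above and shows that `Double.compare` and `Double.doubleToLongBits` treat it like any other NaN. Whether the raw bit pattern survives the round trip through a `double` is platform-dependent according to the `Double.longBitsToDouble` documentation, so the last line only prints it.

```java
public class NanSignBitDemo {
    // Bit pattern quoted in the comment: a quiet NaN with the sign bit set.
    static final long SIGNED_NAN_BITS = 0xfff8000000000000L;

    // Hypothetical helper: builds the NaN value from the bit pattern above.
    static double signedNan() {
        return Double.longBitsToDouble(SIGNED_NAN_BITS);
    }

    public static void main(String[] args) {
        double signed = signedNan();

        // The value is still a NaN as far as Java is concerned.
        System.out.println("isNaN: " + Double.isNaN(signed));

        // Double.compare treats every NaN as equal to Double.NaN
        // and greater than any other value, including -Infinity.
        System.out.println("compare to Double.NaN: "
                + Double.compare(signed, Double.NaN));
        System.out.println("compare to -Infinity: "
                + Double.compare(signed, Double.NEGATIVE_INFINITY));

        // doubleToLongBits collapses all NaNs to the canonical pattern
        // 0x7ff8000000000000, so the sign bit is not visible here.
        System.out.println("canonical bits: "
                + Long.toHexString(Double.doubleToLongBits(signed)));

        // doubleToRawLongBits may still show the sign bit, but the
        // Javadoc does not guarantee the pattern is preserved.
        System.out.println("raw bits: "
                + Long.toHexString(Double.doubleToRawLongBits(signed)));
    }
}
```

Under Java's ordering the value sorts after everything else, like any other NaN, and not before everything as a sign-aware total order would place it.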
