findepi commented on pull request #2891:
URL: https://github.com/apache/iceberg/pull/2891#issuecomment-890738961


   
   
   > It's also reasonable for Java to not distinguish between -NaN and NaN 
values and to only produce positive NaNs. 
   
   It would be more accurate to say that Java doesn't distinguish NaN values 
with the sign bit set from those without.
   In Java, you can construct a NaN value with the sign bit set, e.g. 
`Double.longBitsToDouble(0xfff8000000000000L)`.
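
   A minimal sketch of that point, using only JDK methods. The class and method names (`NanSignBit`, `negativeNan`) are made up for illustration and are not part of Iceberg. Whether the raw bit pattern survives `longBitsToDouble` unchanged can depend on the platform, so the raw-bits line is printed without a claimed output.

```java
final class NanSignBit {
    // Bit pattern from the comment above: a quiet NaN with the sign bit set.
    static final long NEGATIVE_NAN_BITS = 0xfff8000000000000L;

    // Hypothetical helper: builds the sign-bit-set NaN from its bit pattern.
    static double negativeNan() {
        return Double.longBitsToDouble(NEGATIVE_NAN_BITS);
    }

    public static void main(String[] args) {
        double negNan = negativeNan();

        // The value is a NaN like any other as far as Java is concerned.
        System.out.println(Double.isNaN(negNan)); // true

        // doubleToRawLongBits reports the bits the JVM actually holds.
        System.out.println(Long.toHexString(Double.doubleToRawLongBits(negNan)));

        // doubleToLongBits collapses every NaN to the canonical pattern.
        System.out.println(Long.toHexString(Double.doubleToLongBits(negNan))); // 7ff8000000000000

        // Double.compare treats all NaN values as equal to each other.
        System.out.println(Double.compare(negNan, Double.NaN)); // 0
    }
}
```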
   
   If your intention is to expect Java writers to sort these before 
everything else, we still need to update the spec (and Java writers).
   The spec points at Java sorting as the reference implementation.
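
   For reference, this is how plain JDK sorting of `double[]` behaves. It is only a sketch of `Arrays.sort`, on the assumption that "Java sorting" means the `Double.compareTo` ordering; it does not show the comparator Iceberg itself uses. `NanSortOrder` and `sortedCopy` are hypothetical names.

```java
import java.util.Arrays;

final class NanSortOrder {
    // Hypothetical helper: returns a sorted copy, leaving the input untouched.
    static double[] sortedCopy(double[] values) {
        double[] copy = values.clone();
        // Arrays.sort(double[]) uses the Double.compareTo total order:
        // -0.0 < 0.0, and every NaN is equal to every other NaN and
        // greater than all other values, including +Infinity.
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        double negNan = Double.longBitsToDouble(0xfff8000000000000L);
        double[] sorted = sortedCopy(new double[] {
            1.0, negNan, Double.NEGATIVE_INFINITY, Double.NaN, -0.0, 0.0
        });
        // The sign-bit-set NaN ends up last, next to the ordinary NaN,
        // not before -Infinity.
        System.out.println(Arrays.toString(sorted));
        // [-Infinity, -0.0, 0.0, 1.0, NaN, NaN]
    }
}
```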
   
   If your intention is to expect Java writers not to write NaN values with 
the sign bit set, this is in fact the canonicalization that you wanted to avoid.
   
   
   As was observed, signalling NaNs are not portable between processors, so we 
should exclude them from the picture.
   While NaN values can carry an additional payload (the fraction bits, the 
sign), such use is not portable either.
   There might be legitimate use cases for these in some applications, but it 
seems we cannot expect them to be respected by engines that do not 
distinguish them.
   
   Thus if, e.g., a Rust-based application writes -NaN values and a Java 
application rewrites the data file (update, compaction, format change), we 
should expect the NaN attributes to be lost.
   That can be confusing to users, who would consider it a bug, and one we 
probably wouldn't be able to fix.
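
   A sketch of how such a loss could happen. The `rewrite` method is purely hypothetical: it assumes a rewrite path that encodes values with `Double.doubleToLongBits`, which canonicalizes NaNs. It is not a description of Iceberg's actual writer code.

```java
final class NanRewrite {
    // Hypothetical stand-in for a read-then-write cycle in a Java engine.
    // ASSUMPTION: the value is encoded with doubleToLongBits, which maps
    // every NaN to the canonical pattern 0x7ff8000000000000L.
    static double rewrite(double value) {
        long encoded = Double.doubleToLongBits(value);
        return Double.longBitsToDouble(encoded);
    }

    public static void main(String[] args) {
        // A NaN with the sign bit set, as another writer might have produced.
        double original = Double.longBitsToDouble(0xfff8000000000000L);
        double rewritten = rewrite(original);

        // Still a NaN, but the sign bit did not survive the round trip.
        System.out.println(Double.isNaN(rewritten)); // true
        System.out.println(Long.toHexString(Double.doubleToRawLongBits(rewritten)));
        // 7ff8000000000000

        // Ordinary values pass through unchanged.
        System.out.println(rewrite(1.5)); // 1.5
    }
}
```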
   
   It seems it would be safer to have just one NaN concept in the Iceberg spec 
and treat all NaN values as indistinguishable.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


