Dongjoon Hyun created SPARK-26950:
-------------------------------------

             Summary: Make RandomDataGenerator use Float.NaN or Double.NaN for all NaN values
                 Key: SPARK-26950
                 URL: https://issues.apache.org/jira/browse/SPARK-26950
             Project: Spark
          Issue Type: Bug
          Components: SQL, Tests
    Affects Versions: 2.3.4, 2.4.2, 3.0.0
            Reporter: Dongjoon Hyun


Apache Spark uses the predefined `Float.NaN` and `Double.NaN` for NaN values, 
but other NaN values with different binary representations also exist.

{code}
scala> java.nio.ByteBuffer.allocate(4).putFloat(Float.NaN).array
res1: Array[Byte] = Array(127, -64, 0, 0)

scala> val x = java.lang.Float.intBitsToFloat(-6966608)
x: Float = NaN

scala> java.nio.ByteBuffer.allocate(4).putFloat(x).array
res2: Array[Byte] = Array(-1, -107, -78, -80)
{code}
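
As a minimal sketch of why the bytes differ: `java.lang.Float.floatToRawIntBits` keeps whatever payload a NaN carries, while `java.lang.Float.floatToIntBits` collapses every NaN to the canonical pattern `0x7fc00000`. The object and method names below (`NaNBits`, `rawBits`, `canonicalBits`) are illustrative only and are not part of Spark.

```scala
import java.lang.{Float => JFloat}

object NaNBits {
  // A NaN with a non-canonical bit pattern (same input as in the REPL session above).
  val oddNaN: Float = JFloat.intBitsToFloat(-6966608)

  // Raw bits: the NaN payload is kept, so different NaNs can yield different ints.
  def rawBits(f: Float): Int = JFloat.floatToRawIntBits(f)

  // Canonical bits: every NaN is mapped to 0x7fc00000, the pattern of Float.NaN.
  def canonicalBits(f: Float): Int = JFloat.floatToIntBits(f)
}
```

Both values report `isNaN == true`, so a value-level comparison cannot tell them apart, while a byte-level comparison can.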

`RandomDataGenerator` generates these NaN values. That is valid, but it causes 
`checkEvaluationWithUnsafeProjection` failures because the `UnsafeRow` binary 
representations differ. The following is one such unit test failure. This 
issue aims to fix this flakiness.

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/102528/testReport/

{code}
Failed
org.apache.spark.sql.avro.AvroCatalystDataConversionSuite.flat schema struct<col_0:decimal(16,11),col_1:float,col_2:decimal(38,0),col_3:decimal(38,0),col_4:string> with seed -81044812370056695
{code}
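
One possible shape of the fix, sketched under assumptions: a generator that builds floating-point values from random bits can replace any NaN it produces with the predefined constant, so all generated NaNs share one binary representation. The names `NaNNormalizingGenerator`, `normalize`, `randomFloat`, and `randomDouble` are hypothetical; this is not the actual `RandomDataGenerator` code or the patch for this issue.

```scala
import java.lang.{Double => JDouble, Float => JFloat}
import scala.util.Random

object NaNNormalizingGenerator {
  // Collapse any NaN to the predefined constant; other values pass through unchanged.
  def normalize(f: Float): Float = if (f.isNaN) Float.NaN else f
  def normalize(d: Double): Double = if (d.isNaN) Double.NaN else d

  // Build a float from 32 random bits (this can yield arbitrary NaN payloads),
  // then normalize so only Float.NaN ever leaves the generator.
  def randomFloat(rand: Random): Float =
    normalize(JFloat.intBitsToFloat(rand.nextInt()))

  // Same idea for doubles, using 64 random bits.
  def randomDouble(rand: Random): Double =
    normalize(JDouble.longBitsToDouble(rand.nextLong()))
}
```

With this normalization, two rows that hold NaN in the same column would carry identical bytes, which is what a byte-wise `UnsafeRow` comparison needs.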



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

