Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23065#discussion_r234393799
  
    --- Diff: 
mllib/src/test/scala/org/apache/spark/ml/feature/QuantileDiscretizerSuite.scala 
---
    @@ -276,10 +276,10 @@ class QuantileDiscretizerSuite extends MLTest with 
DefaultReadWriteTest {
           1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
         val data2 = Array.range(1, 40, 2).map(_.toDouble)
         val expected2 = Array (0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 
2.0,
    -      2.0, 2.0, 3.0, 3.0, 3.0, 4.0, 4.0, 4.0, 4.0, 4.0)
    +      2.0, 3.0, 3.0, 3.0, 3.0, 4.0, 4.0, 4.0, 4.0, 4.0)
    --- End diff --
    
    Interestingly, avoiding Double ranges actually fixed the code here. You can
see the bucketing before didn't quite make sense; now it's even. It's because a
Double-stepped range accumulates floating-point error at every step:
    
    ```
    scala> (0.0 to 1.0 by 1.0 / 10).toList
    <console>:12: warning: method to in trait FractionalProxy is deprecated 
(since 2.12.6): use BigDecimal range instead
           (0.0 to 1.0 by 1.0 / 10).toList
                ^
    res5: List[Double] = List(0.0, 0.1, 0.2, 0.30000000000000004, 0.4, 0.5, 
0.6, 0.7, 0.7999999999999999, 0.8999999999999999, 0.9999999999999999)
    
    scala> (0 to 10).map(_.toDouble / 10).toList
    res6: List[Double] = List(0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 
1.0)
    ```
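
    The integer-stepped pattern from the REPL session can be wrapped in a small
helper. (This is a sketch, not code from the PR; the name `evenProbabilities` is
hypothetical.) Each element is produced by a single division, so it is one
correctly rounded operation away from exact, instead of carrying the error
accumulated by repeated `+ 0.1` additions:

    ```scala
    // Hedged sketch: evenly spaced probabilities without a Double-stepped range.
    // Stepping over Ints and dividing once per element avoids accumulated
    // floating-point error, so the endpoints land exactly on 0.0 and 1.0.
    object SplitPoints {
      def evenProbabilities(numBuckets: Int): Array[Double] =
        (0 to numBuckets).map(_.toDouble / numBuckets).toArray

      def main(args: Array[String]): Unit = {
        val probs = evenProbabilities(10)
        assert(probs.head == 0.0)
        assert(probs(3) == 0.3)   // 3.0 / 10 rounds to the same Double as the literal 0.3
        assert(probs.last == 1.0) // no drift at the upper endpoint
        println(probs.mkString(", "))
      }
    }
    ```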


---
