[GitHub] spark pull request #20362: [Spark-22886][ML][TESTS] ML test for structured s...

smurakozi Fri, 02 Feb 2018 12:01:23 -0800

Github user smurakozi commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20362#discussion_r165745445
  
    --- Diff: 
mllib/src/test/scala/org/apache/spark/ml/recommendation/ALSSuite.scala ---
    @@ -599,8 +598,11 @@ class ALSSuite
               (ex, act) =>
                 ex.userFactors.first().getSeq[Float](1) === 
act.userFactors.first.getSeq[Float](1)
             } { (ex, act, _) =>
    -          ex.transform(_: 
DataFrame).select("prediction").first.getDouble(0) ~==
    -            act.transform(_: 
DataFrame).select("prediction").first.getDouble(0) absTol 1e-6
    +          testTransformerByGlobalCheckFunc[Float](_: DataFrame, act, 
"prediction") {
    +            case actRows: Seq[Row] =>
    +              ex.transform(_: 
DataFrame).select("prediction").first.getDouble(0) ~==
    +                actRows(0).getDouble(0) absTol 1e-6
    +          }
    --- End diff --
    
    I think this code does not check anything. 
    `testTransformerByGlobalCheckFunc[Float](_: DataFrame, act, "prediction")` 
is just a partial application of `testTransformerByGlobalCheckFunc`. 
    However,  `checkNumericTypesALS` expects `check2: (ALSModel, ALSModel, 
DataFrame) => Unit`. It's happy to call the provided function, discard the 
partially applied function and use `()` instead, so it will typecheck.
    As a consequence, the function doing the assert is never called, so the 
`~===` assertion never happens. You can check it say by asking for the 100th 
column of the first row - it will not produce an error.
    
    This problem is not a result of your change, the original code had the same 
issue.
    
    It could probably be simplified a bit but I think the original intent was 
to do a check like this:
    
    ```
    { (ex, act, df) =>
              ex.transform(df).selectExpr("cast(prediction as 
double)").first.getDouble(0) ~==
                act.transform(df).selectExpr("cast(prediction as 
double)").first.getDouble(0) absTol
                  1e-6
            }
    ```



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #20362: [Spark-22886][ML][TESTS] ML test for structured s...

Reply via email to