Github user smurakozi commented on a diff in the pull request: https://github.com/apache/spark/pull/20362#discussion_r165745445 --- Diff: mllib/src/test/scala/org/apache/spark/ml/recommendation/ALSSuite.scala --- @@ -599,8 +598,11 @@ class ALSSuite (ex, act) => ex.userFactors.first().getSeq[Float](1) === act.userFactors.first.getSeq[Float](1) } { (ex, act, _) => - ex.transform(_: DataFrame).select("prediction").first.getDouble(0) ~== - act.transform(_: DataFrame).select("prediction").first.getDouble(0) absTol 1e-6 + testTransformerByGlobalCheckFunc[Float](_: DataFrame, act, "prediction") { + case actRows: Seq[Row] => + ex.transform(_: DataFrame).select("prediction").first.getDouble(0) ~== + actRows(0).getDouble(0) absTol 1e-6 + } --- End diff -- I think this code does not check anything. `testTransformerByGlobalCheckFunc[Float](_: DataFrame, act, "prediction")` is just a partial application of `testTransformerByGlobalCheckFunc`. However, `checkNumericTypesALS` expects `check2: (ALSModel, ALSModel, DataFrame) => Unit`. It's happy to call the provided function, discard the partially applied function and use `()` instead, so it will typecheck. As a consequence, the function doing the assert is never called, so the `~===` assertion never happens. You can check it say by asking for the 100th column of the first row - it will not produce an error. This problem is not a result of your change, the original code had the same issue. It could probably be simplified a bit but I think the original intent was to do a check like this: ``` { (ex, act, df) => ex.transform(df).selectExpr("cast(prediction as double)").first.getDouble(0) ~== act.transform(df).selectExpr("cast(prediction as double)").first.getDouble(0) absTol 1e-6 } ```
--- --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org