Sean's PR may be relevant to this issue (https://github.com/apache/spark/pull/3702). As a workaround, you can try truncating the raw scores to 4 digits (e.g., 0.5643215 -> 0.5643) before sending them to BinaryClassificationMetrics. This may not work well if the score distribution is very skewed. See the discussion on https://issues.apache.org/jira/browse/SPARK-4547. -Xiangrui
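A minimal sketch of the truncation workaround described above, assuming the `predictionAndLabel` RDD of (score, label) pairs from the thread below; `truncateScore` is a hypothetical helper name, not a Spark API:

```scala
// Hedged sketch: truncating each raw score to 4 decimal places caps the
// number of distinct thresholds at ~10001, which bounds the size of the
// shuffle BinaryClassificationMetrics performs via combineByKey.
def truncateScore(score: Double): Double =
  math.floor(score * 10000) / 10000

// Applied to the (score, label) pairs before computing metrics:
// val truncated = predictionAndLabel.map { case (score, label) =>
//   (truncateScore(score), label)
// }
// val metrics = new BinaryClassificationMetrics(truncated)
```

As noted, this trades threshold resolution for a smaller shuffle, so it may distort the curve if the scores cluster in a narrow range.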
On Tue, Dec 23, 2014 at 9:00 AM, Thomas Kwan <thomas.k...@manage.com> wrote:
> Hi there,
>
> We are using mllib 1.1.1, and doing Logistic Regression with a dataset of
> about 150M rows. The training part usually goes pretty smoothly without
> any retries, but during the prediction stage and the
> BinaryClassificationMetrics stage I am seeing retries with "fetch failure"
> errors.
>
> The prediction part is as follows:
>
>     val predictionAndLabel = testRDD.map { point =>
>       val prediction = model.predict(point.features)
>       (prediction, point.label)
>     }
>     ...
>     val metrics = new BinaryClassificationMetrics(predictionAndLabel)
>
> The fetch failure happened with the following stack trace:
>
>     org.apache.spark.rdd.PairRDDFunctions.combineByKey(PairRDDFunctions.scala:515)
>     org.apache.spark.mllib.evaluation.BinaryClassificationMetrics.x$3$lzycompute(BinaryClassificationMetrics.scala:101)
>     org.apache.spark.mllib.evaluation.BinaryClassificationMetrics.x$3(BinaryClassificationMetrics.scala:96)
>     org.apache.spark.mllib.evaluation.BinaryClassificationMetrics.confusions$lzycompute(BinaryClassificationMetrics.scala:98)
>     org.apache.spark.mllib.evaluation.BinaryClassificationMetrics.confusions(BinaryClassificationMetrics.scala:98)
>     org.apache.spark.mllib.evaluation.BinaryClassificationMetrics.createCurve(BinaryClassificationMetrics.scala:142)
>     org.apache.spark.mllib.evaluation.BinaryClassificationMetrics.roc(BinaryClassificationMetrics.scala:50)
>     org.apache.spark.mllib.evaluation.BinaryClassificationMetrics.areaUnderROC(BinaryClassificationMetrics.scala:60)
>     com.manage.ml.evaluation.BinaryClassificationMetrics.areaUnderROC(BinaryClassificationMetrics.scala:14)
>     ...
>
> We are running in yarn-client mode with 32 executors, 16G executor memory,
> and 12 cores in the spark-submit settings.
>
> I wonder if anyone has a suggestion on how to debug this.
>
> thanks in advance
> thomas