The documentation needs to be updated to state that higher metric
values are better (https://issues.apache.org/jira/browse/SPARK-7740).
I don't know why you still get the highest regularization parameter
candidate after negating the return value of the Evaluator. You could
check the log messages from CrossValidator to see the average metric
values during cross validation. -Xiangrui
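
To illustrate the "higher is better" convention, here is a minimal
stand-alone sketch. It does not use Spark and is not CrossValidator's
actual code. The object name, helper functions, and prediction values are
all made up for illustration. The idea is that an error metric such as
RMSE has to be negated before it is returned, so that the candidate with
the smallest error gets the largest metric value.

```scala
// Hypothetical sketch of the "higher metric is better" convention.
// No Spark dependency; names and numbers are illustrative assumptions.
object HigherIsBetter {
  // Root mean squared error over (prediction, label) pairs.
  def rmse(pairs: Seq[(Double, Double)]): Double =
    math.sqrt(pairs.map { case (p, l) => (p - l) * (p - l) }.sum / pairs.size)

  // Negate the error so that a smaller error yields a larger metric.
  def metric(pairs: Seq[(Double, Double)]): Double = -rmse(pairs)

  // Pick the candidate whose metric is largest.
  def best[P](scored: Seq[(P, Double)]): P = scored.maxBy(_._2)._1

  def main(args: Array[String]): Unit = {
    val labels = Seq(1.0, 2.0, 3.0)
    // Made-up predictions for two hypothetical regParam candidates.
    val candidates = Seq(
      0.01 -> Seq(1.1, 2.1, 2.9),
      1.0  -> Seq(2.0, 2.0, 2.0)
    )
    val scored = candidates.map { case (reg, preds) =>
      reg -> metric(preds.zip(labels))
    }
    // The first candidate has the smaller RMSE, so its negated value wins.
    println(best(scored))
  }
}
```

In a real Evaluator the same idea would apply to the Double returned from
evaluate: return the negated error, not the error itself.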

On Sat, May 9, 2015 at 12:15 PM, Stefan H. <twel...@gmx.de> wrote:
> Hello everyone,
>
> I am stuck with the (experimental, I think) API for machine learning
> pipelines. I have a pipeline with just one estimator (ALS) and I want it to
> try different values for the regularization parameter. Therefore I need to
> supply an Evaluator that returns a value of type Double. I guess this could
> be something like accuracy or mean squared error? The only implementation I
> found is BinaryClassificationEvaluator, and I did not understand the
> computation there.
>
> I could not find detailed documentation so I implemented a dummy Evaluator
> that just returns the regularization parameter:
>
>   new Evaluator {
>     def evaluate(dataset: DataFrame, paramMap: ParamMap): Double =
>       paramMap.get(als.regParam).getOrElse(throw new Exception)
>   }
>
> I just wanted to see whether the lower or higher value "wins". On the
> resulting model I inspected the chosen regularization parameter this way:
>
>   cvModel.bestModel.fittingParamMap.get(als.regParam)
>
> And it was the highest of my three regularization parameter candidates.
> Strange thing is, if I negate the return value of the Evaluator, that line
> still returns the highest regularization parameter candidate.
>
> So I am probably working with false assumptions. I'd be grateful if someone
> could point me to some documentation or examples, or has a few hints to
> share.
>
> Cheers,
> Stefan
>
>
>
> --
> View this message in context: 
> http://apache-spark-user-list.1001560.n3.nabble.com/How-to-implement-an-Evaluator-for-a-ML-pipeline-tp22830.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
