This is addressed in https://issues.apache.org/jira/browse/SPARK-4789.
In the new pipeline API, we can simply output two columns, one for the
best predicted class, and the other for probabilities or confidence
scores for each class. -Xiangrui
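
For illustration only, here is a minimal sketch of the "best class plus per-class probabilities" output for a multinomial Naive Bayes model. It is plain Python, not Spark code. The function name and the toy numbers are hypothetical, and it assumes you have the model's log class priors and log conditional probabilities (which I believe correspond to `pi` and `theta` in MLlib's NaiveBayesModel; treat that mapping as an assumption).

```python
import math

def predict_with_probabilities(log_priors, log_likelihoods, features):
    """Return (best_class_index, posterior_probabilities).

    log_priors[c]         : log P(class c)
    log_likelihoods[c][j] : log P(feature j | class c)
    features[j]           : count of feature j in the example
    """
    # Unnormalized log-posterior per class: log P(c) + sum_j x_j * log P(j | c)
    scores = [
        prior + sum(x * w for x, w in zip(features, weights))
        for prior, weights in zip(log_priors, log_likelihoods)
    ]
    # Subtract the max score before exponentiating for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    probabilities = [e / total for e in exps]
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return best, probabilities

# Toy two-class, two-feature model (made-up numbers).
log_priors = [math.log(0.5), math.log(0.5)]
log_likelihoods = [
    [math.log(0.8), math.log(0.2)],
    [math.log(0.3), math.log(0.7)],
]
label, probs = predict_with_probabilities(log_priors, log_likelihoods, [3, 1])
print(label, probs)
```

The two return values correspond to the two output columns described above: the predicted class and the vector of per-class probabilities.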

On Tue, Jan 6, 2015 at 11:43 AM, Jianguo Li <flyingfromch...@gmail.com> wrote:
> Hi,
>
> A while ago, somebody asked about getting a confidence value for a prediction
> from MLlib's implementation of the Naive Bayes classifier.
>
> I was wondering if there is any plan in the near future for the predict
> function to return both a label and a confidence/probability? Or could the
> private variables in the various machine learning models be exposed so we
> could write our own functions which return both?
>
> Having a confidence/probability could be very useful in real applications.
> For one thing, you can choose to trust the predicted label only if it has a
> high confidence level. Also, if you want to combine the results from
> multiple classifiers, the confidence/probability could be used as a weight
> when combining them.
>
> Thanks,
>
> Jianguo
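
The two uses mentioned in the quoted message (trusting a label only above a confidence level, and weighting several classifiers by confidence) could be sketched roughly as follows. This is plain Python with hypothetical function names and an arbitrary threshold. It is one simple weighting scheme among many, not something MLlib provides.

```python
def accept_if_confident(label, confidence, threshold=0.9):
    """Return the label only when its confidence meets the threshold, else None."""
    return label if confidence >= threshold else None

def combine_weighted(predictions):
    """Confidence-weighted vote over (label, confidence) pairs, one per classifier.

    Returns the winning label and its share of the total confidence mass.
    Assumes at least one prediction with a positive confidence.
    """
    totals = {}
    for label, confidence in predictions:
        totals[label] = totals.get(label, 0.0) + confidence
    best = max(totals, key=totals.get)
    return best, totals[best] / sum(totals.values())

# Three hypothetical classifiers voting on one example.
votes = [("spam", 0.6), ("ham", 0.9), ("spam", 0.55)]
winner, share = combine_weighted(votes)
print(winner, round(share, 3))
print(accept_if_confident("spam", 0.95), accept_if_confident("spam", 0.5))
```

Summing confidences per label lets two moderately confident classifiers outvote one highly confident one. Whether that is desirable depends on how well calibrated each classifier's scores are.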
