[ https://issues.apache.org/jira/browse/SPARK-4736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14233906#comment-14233906 ]

Apache Spark commented on SPARK-4736:
-------------------------------------

User 'dikejiang' has created a pull request for this issue:
https://github.com/apache/spark/pull/3583

> functions returning the category with weights
> ---------------------------------------------
>
>                 Key: SPARK-4736
>                 URL: https://issues.apache.org/jira/browse/SPARK-4736
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>            Reporter: yu jiang
>
> This change adds two functions, 1) predictByVotingWithWeight(features:
> Vector) and 2) predictWithWeight(features: Vector), and modifies the
> existing function predictByVoting(features: Vector). There are at least two
> reasons for the improvement: 1) In practice, we want to find the top N
> samples from one category. However, in version 1.3.0, predict returns only
> the predicted category, without weights. 2) In our data, the numbers of
> positive and negative samples are very unbalanced: there are far fewer
> positive samples than negative ones. With plain voting, very few samples
> are predicted as positive. If the weights are also returned, users can
> choose a suitable threshold to adjust the results and improve
> performance.
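
To illustrate the idea in the description, here is a minimal sketch of
weighted voting. It is not the code from the pull request. It assumes that
each tree can be modelled as a plain function from a feature array to an
integer class label (MLlib itself uses Vector features and Double labels),
and that the "weight" is the fraction of trees that voted for a class. The
names Tree, voteCount, voteShare, predictWithThreshold and topN are
hypothetical helpers introduced only for this example.

```scala
object WeightedVoting {
  // Hypothetical stand-in for a trained tree: feature array => class label.
  type Tree = Array[Double] => Int

  /** Number of trees that vote for `label` on this sample. */
  def voteCount(trees: Seq[Tree], features: Array[Double], label: Int): Int =
    trees.count(tree => tree(features) == label)

  /** Fraction of trees that vote for `label` (the assumed "weight"). */
  def voteShare(trees: Seq[Tree], features: Array[Double], label: Int): Double = {
    require(trees.nonEmpty, "need at least one tree")
    voteCount(trees, features, label).toDouble / trees.size
  }

  /** Majority label together with its vote share. */
  def predictByVotingWithWeight(trees: Seq[Tree], features: Array[Double]): (Int, Double) = {
    require(trees.nonEmpty, "need at least one tree")
    val counts: Seq[(Int, Int)] =
      trees.map(tree => tree(features))
        .groupBy(identity)
        .map { case (label, votes) => (label, votes.size) }
        .toSeq
    // Most votes first; ties go to the smaller label so the result is deterministic.
    val (label, count) = counts.sortBy { case (l, c) => (-c, l) }.head
    (label, count.toDouble / trees.size)
  }

  /** Predict the positive class (1) once its vote share reaches `threshold`. */
  def predictWithThreshold(trees: Seq[Tree], features: Array[Double], threshold: Double): Int =
    if (voteShare(trees, features, 1) >= threshold) 1 else 0

  /** The n samples with the most votes for `label`, paired with their vote share. */
  def topN(trees: Seq[Tree], samples: Seq[Array[Double]], label: Int, n: Int): Seq[(Array[Double], Double)] =
    samples
      .map(s => (s, voteCount(trees, s, label)))
      .sortBy { case (_, c) => -c }
      .take(n)
      .map { case (s, c) => (s, c.toDouble / trees.size) }
}
```

With unbalanced classes, a threshold below 0.5 in predictWithThreshold
lets a sample be labelled positive even when most trees vote negative,
which is the adjustment the description asks for. topN covers the other
use case, ranking samples within one category by weight.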



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

