[ 
https://issues.apache.org/jira/browse/FLINK-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15689765#comment-15689765
 ] 

ASF GitHub Bot commented on FLINK-4712:
---------------------------------------

Github user thvasilo commented on the issue:

    https://github.com/apache/flink/pull/2838
  
    Hello Gabor, 
    
    I like the idea of having a RankingScore. A hierarchy with Score, 
RankingScore and PairWiseScore seems to give us the flexibility we need to 
include ranking and supervised learning evaluation under the same umbrella.
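    
    To make the hierarchy concrete, here is a minimal sketch of how the three 
types could relate. It is not the code in the PR: plain Scala `Seq` stands in 
for `DataSet`, and the type parameter, the method signatures, and the two 
example metrics (`MeanSquaredError`, `PrecisionAtK`) are assumptions made for 
illustration only.

```scala
// Hypothetical sketch: plain Scala Seq stands in for Flink's DataSet, and
// every signature below is an assumption, not the API proposed in the PR.
object ScoreHierarchySketch {
  // Common root: anything that reduces an evaluation input to a single number.
  trait Score[Input] {
    def evaluate(input: Input): Double
  }

  // Pairwise scores compare (truth, prediction) pairs, as in supervised learning.
  trait PairWiseScore extends Score[Seq[(Double, Double)]]

  // Ranking scores compare, per user, a ranked item list with the relevant items.
  trait RankingScore extends Score[Seq[(Seq[Int], Set[Int])]]

  // Example pairwise metric: mean of squared differences.
  object MeanSquaredError extends PairWiseScore {
    def evaluate(pairs: Seq[(Double, Double)]): Double =
      if (pairs.isEmpty) 0.0
      else pairs.map { case (t, p) => (t - p) * (t - p) }.sum / pairs.size
  }

  // Example ranking metric: share of the first k ranked items that are relevant,
  // averaged over users.
  final case class PrecisionAtK(k: Int) extends RankingScore {
    def evaluate(perUser: Seq[(Seq[Int], Set[Int])]): Double =
      if (perUser.isEmpty) 0.0
      else perUser.map { case (ranked, relevant) =>
        ranked.take(k).count(relevant.contains).toDouble / k
      }.sum / perUser.size
  }
}
```

    With these definitions, `PrecisionAtK(2)` on a single user whose ranked 
list is 1, 2, 3 and whose relevant items are {1, 3} evaluates to 0.5. Both 
kinds of score share the same root type, which is what would let 
cross-validation code treat them uniformly.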
    
    I would also encourage sharing any other ideas you broached that might 
break the API. This is still very much an evolving project, and there is no need 
to shoehorn everything into an `evaluate(test: TestType): DataSet[Double]` 
function if there are better alternatives.
    
    One thing we need to consider is how this affects cross-validation and 
model selection/hyper-parameter tuning. These two aspects of the library are 
tightly linked and I think that we'll need to work on them in parallel to find 
issues that affect both.
    
    I recommend taking a look at the [cross-validation 
PR](https://github.com/apache/flink/pull/891) I had opened way back when, and 
making a new WIP PR that uses the current one (#2838) as a basis. Since the 
`Score` interface still exists, it shouldn't require many changes, and all 
that's added is the CrossValidation class. There are other fundamental issues 
with the sampling there that we can discuss in due time.
    
    Regarding the RankingPredictor, we should consider the use case of such an 
interface. Is it only going to be used for recommendation? If yes, what are the 
cases where we could build a Pipeline with current or future pre-processing 
steps? Could you give some pipeline examples in a recommendation setting?


> Implementing ranking predictions for ALS
> ----------------------------------------
>
>                 Key: FLINK-4712
>                 URL: https://issues.apache.org/jira/browse/FLINK-4712
>             Project: Flink
>          Issue Type: New Feature
>          Components: Machine Learning Library
>            Reporter: Domokos Miklós Kelen
>            Assignee: Gábor Hermann
>
> We started working on implementing ranking predictions for recommender 
> systems. Ranking prediction means that besides predicting scores for user-item 
> pairs, the recommender system is able to recommend a top K list for the users.
> Details:
> In practice, this would mean finding the K items for a particular user with 
> the highest predicted rating. It should also be possible to specify whether 
> to exclude the already seen items from a particular user's toplist. (See for 
> example the 'exclude_known' setting of [Graphlab Create's ranking 
> factorization 
> recommender|https://turi.com/products/create/docs/generated/graphlab.recommender.ranking_factorization_recommender.RankingFactorizationRecommender.recommend.html#graphlab.recommender.ranking_factorization_recommender.RankingFactorizationRecommender.recommend]
>  ).
> The output of the topK recommendation function could be in the form of 
> {{DataSet[(Int,Int,Int)]}}, meaning (user, item, rank), similar to Graphlab 
> Create's output. However, this is arguable: follow-up work includes 
> implementing ranking recommendation evaluation metrics (such as precision@k, 
> recall@k, ndcg@k), similar to [Spark's 
> implementations|https://spark.apache.org/docs/1.5.0/mllib-evaluation-metrics.html#ranking-systems].
>  It would be beneficial if we were able to design the API such that it could 
> be included in the proposed evaluation framework (see 
> [2157|https://issues.apache.org/jira/browse/FLINK-2157]), which makes it 
> necessary to consider the possible output type {{DataSet[(Int, 
> Array[Int])]}} or {{DataSet[(Int, Array[(Int,Double)])]}} meaning (user, 
> array of items), possibly including the predicted scores as well. See 
> [4713|https://issues.apache.org/jira/browse/FLINK-4713] for details.
> Another question arising is whether to provide this function as a member of 
> the ALS class, as a switch-kind of parameter to the ALS implementation 
> (meaning the model is either a rating or a ranking recommender model) or in 
> some other way.
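
The top-K step and the two candidate output shapes described in the issue can 
be sketched as follows. This is only an illustration under stated assumptions: 
plain Scala collections stand in for `DataSet`, and the names `topK`, 
`excludeKnown` and `toRankedArrays` are hypothetical, not part of ALS or 
FlinkML.

```scala
// Hypothetical sketch: plain Scala collections stand in for DataSet, and the
// names topK, excludeKnown and toRankedArrays are illustrative assumptions.
object TopKSketch {
  // predictions: (user, item, predicted score); seen: (user, item) pairs already rated.
  // Returns (user, item, rank) rows, rank starting at 1 for the best score.
  def topK(
      predictions: Seq[(Int, Int, Double)],
      seen: Set[(Int, Int)],
      k: Int,
      excludeKnown: Boolean = true): Seq[(Int, Int, Int)] = {
    // Optionally drop items the user has already interacted with.
    val candidates =
      if (excludeKnown) predictions.filterNot { case (u, i, _) => seen((u, i)) }
      else predictions
    candidates
      .groupBy(_._1)
      .toSeq
      .sortBy(_._1) // deterministic user order for the example
      .flatMap { case (user, rows) =>
        rows
          // Highest score first; ties broken by the smaller item id.
          .sortWith((a, b) => a._3 > b._3 || (a._3 == b._3 && a._2 < b._2))
          .take(k)
          .zipWithIndex
          .map { case ((_, item, _), idx) => (user, item, idx + 1) }
      }
  }

  // Regroups (user, item, rank) rows into the alternative (user, Array[item]) shape.
  def toRankedArrays(rows: Seq[(Int, Int, Int)]): Seq[(Int, Array[Int])] =
    rows.groupBy(_._1).toSeq.sortBy(_._1).map { case (user, rs) =>
      (user, rs.sortBy(_._3).map(_._2).toArray)
    }
}
```

Since the (user, item, rank) rows can be regrouped into per-user arrays with a 
single group-and-sort, the choice between the two output types is mostly about 
which shape the evaluation metrics consume directly.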



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
