
ASF GitHub Bot commented on FLINK-4712:

Github user gaborhermann commented on the issue:

    Thanks again for taking a look at our PR!
    I've just realized from a developer mailing list thread that the FlinkML 
API is not set in stone, even up to 2.0, and it's nice to hear that :)
    The problem is not with `evaluate(test: TestType): DataSet[Double]` but 
rather with `evaluate(test: TestType): DataSet[(Prediction,Prediction)]`. Having 
both is confusing at the very least, and it might not be worth exposing the one 
that returns `(Prediction,Prediction)` pairs to the user, as it only *prepares* 
evaluation. With the introduction of the evaluation framework, we could at 
least rename it to something like 
`preparePairwiseEvaluation(test: TestType): DataSet[(Prediction,Prediction)]`. 
In the ranking case we might generalize it 
to `prepareEvaluation(test: TestType): PreparedTesting`. We basically did this 
with the `PrepareDataSetOperation`; we've just left the old `evaluate` as it is 
for now. I suggest changing this if we can break the API.
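    To make the proposed split concrete, here is a minimal sketch, not the PR's actual code. `Seq` stands in for `DataSet` so it runs without Flink, and `PairwiseEvaluable`, `PairwiseScore`, `MeanSquaredError` and `DoublingModel` are hypothetical names used only for illustration.

```scala
// Hypothetical sketch: Seq replaces Flink's DataSet to keep the example self-contained.
trait PairwiseEvaluable[Testing, Prediction] {
  // Only *prepares* evaluation: pairs each true value with the predicted one.
  def preparePairwiseEvaluation(test: Seq[Testing]): Seq[(Prediction, Prediction)]
}

// A score consumes the prepared pairs and produces the final number.
trait PairwiseScore[Prediction] {
  def score(pairs: Seq[(Prediction, Prediction)]): Double
}

// Mean squared error over (truth, prediction) pairs; 0.0 for an empty input.
object MeanSquaredError extends PairwiseScore[Double] {
  def score(pairs: Seq[(Double, Double)]): Double =
    if (pairs.isEmpty) 0.0
    else pairs.map { case (truth, predicted) =>
      val diff = truth - predicted
      diff * diff
    }.sum / pairs.size
}

// Toy model for demonstration: predicts twice the input feature.
// Test data is a sequence of (feature, truth) pairs.
class DoublingModel extends PairwiseEvaluable[(Double, Double), Double] {
  def preparePairwiseEvaluation(test: Seq[(Double, Double)]): Seq[(Double, Double)] =
    test.map { case (feature, truth) => (truth, 2.0 * feature) }

  // The user-facing method returns a score, not the raw pairs.
  def evaluate(test: Seq[(Double, Double)], score: PairwiseScore[Double]): Double =
    score.score(preparePairwiseEvaluation(test))
}
```

    With this shape there is only one `evaluate`, and the pair-producing step carries a name that says it is a preparation step.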
    I'll do a rebase on the cross-validation PR. At first glance, it should not 
really be a problem to do both cross-validation and hyper-parameter tuning, as 
the user has to provide a `Scorer` anyway. A minor issue I see is the user 
falling back to a default `score` (e.g. RMSE in case of ALS). This might not be 
a problem for recommendation models that give rating predictions in addition to 
ranking predictions, but it is a problem for models that *only* give ranking 
predictions, because those do not extend the `Predictor` class. This is not an 
issue for now, but might become one when adding more recommendation models. 
Should we try to address this now, or is it a bit of "overengineering"? I'll 
see if any other problems come up after rebasing.
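    The fallback concern can be sketched as follows. This is only an illustration under assumed names: `Scorer`, `Model`, `RmseScorer`, `PrecisionAtKScorer`, `RatingModel`, `RankingOnlyModel` and `ModelSelection` are hypothetical stand-ins, not FlinkML types.

```scala
// Hypothetical stand-ins; none of these are FlinkML's actual types.
trait Scorer { def name: String }
case object RmseScorer extends Scorer { val name: String = "RMSE" }
case object PrecisionAtKScorer extends Scorer { val name: String = "precision@k" }

trait Model {
  // Rating models can offer a default score; ranking-only models cannot.
  def defaultScorer: Option[Scorer]
}

class RatingModel extends Model {
  val defaultScorer: Option[Scorer] = Some(RmseScorer)
}

class RankingOnlyModel extends Model {
  val defaultScorer: Option[Scorer] = None
}

object ModelSelection {
  // An explicitly provided scorer wins; otherwise fall back to the model's
  // default, and fail with a message if there is none.
  def resolveScorer(model: Model, explicit: Option[Scorer]): Either[String, Scorer] =
    explicit
      .orElse(model.defaultScorer)
      .toRight("no default score available: a Scorer must be provided")
}
```

    As long as cross-validation and hyper-parameter tuning always receive an explicit scorer, the `Left` branch is never reached, which matches the observation that this is not an issue for now.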
    The `RankingPredictor` interface is useful *internally* for the `Score`s. 
It serves as a contract between a `RankingScore` and the model. I'm sure it will 
be used only for recommendations, but exposing it takes no effort, so the user 
can write code using a general `RankingPredictor` (although I don't think 
this is what users would like to do :) ). A better question is whether to use 
it in a `Pipeline`. We discussed this with some people, and could not really 
find a use-case where we need a `Transformer`-like preprocessing for 
recommendations. Of course, there could be other preprocessing steps, such as 
removing/aggregating duplicates, but those do not have to be `fit` to training 
data. Based on this, it's not worth the effort to integrate `RankingPredictor` 
with the `Pipeline`, at least for now.
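    As an illustration of the contract idea, here is a minimal sketch of a ranking score that depends only on a `RankingPredictor`-like interface. The method signatures, the use of `Map`/`Seq` instead of `DataSet`, and the `PrecisionAtK` and `FixedRankings` classes are assumptions made for this example, not the interfaces in the PR.

```scala
// Hypothetical sketch; Seq and Map stand in for Flink DataSets.
trait RankingPredictor {
  // Returns, per requested user, up to k item ids, best first.
  def predictRankings(k: Int, users: Seq[Int]): Map[Int, Seq[Int]]
}

trait RankingScore {
  // relevant: for each user, the set of items considered relevant in the test data.
  def score(model: RankingPredictor, relevant: Map[Int, Set[Int]]): Double
}

// Mean precision@k over all users that appear in the test data.
class PrecisionAtK(k: Int) extends RankingScore {
  def score(model: RankingPredictor, relevant: Map[Int, Set[Int]]): Double = {
    val users = relevant.keys.toSeq
    if (users.isEmpty || k <= 0) 0.0
    else {
      val rankings = model.predictRankings(k, users)
      val perUser = users.map { user =>
        // Count how many of the top-k recommended items are relevant.
        val hits = rankings.getOrElse(user, Seq.empty).take(k).count(relevant(user))
        hits.toDouble / k
      }
      perUser.sum / users.size
    }
  }
}

// Toy model with fixed rankings, only for demonstration.
class FixedRankings(rankings: Map[Int, Seq[Int]]) extends RankingPredictor {
  def predictRankings(k: Int, users: Seq[Int]): Map[Int, Seq[Int]] =
    users.map(user => user -> rankings.getOrElse(user, Seq.empty).take(k)).toMap
}
```

    The score never needs to know which model produced the rankings, which is the only thing the interface is needed for here; nothing in it requires a `fit`-style preprocessing step.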

> Implementing ranking predictions for ALS
> ----------------------------------------
>                 Key: FLINK-4712
>                 URL: https://issues.apache.org/jira/browse/FLINK-4712
>             Project: Flink
>          Issue Type: New Feature
>          Components: Machine Learning Library
>            Reporter: Domokos Miklós Kelen
>            Assignee: Gábor Hermann
> We started working on implementing ranking predictions for recommender 
> systems. Ranking prediction means that besides predicting scores for user-item 
> pairs, the recommender system is able to recommend a top-K list for the users.
> Details:
> In practice, this would mean finding the K items for a particular user with 
> the highest predicted rating. It should also be possible to specify whether 
> to exclude the already seen items from a particular user's toplist. (See for 
> example the 'exclude_known' setting of [Graphlab Create's ranking 
> factorization 
> recommender|https://turi.com/products/create/docs/generated/graphlab.recommender.ranking_factorization_recommender.RankingFactorizationRecommender.recommend.html#graphlab.recommender.ranking_factorization_recommender.RankingFactorizationRecommender.recommend].)
> The output of the topK recommendation function could be in the form of 
> {{DataSet[(Int,Int,Int)]}}, meaning (user, item, rank), similar to Graphlab 
> Create's output. However, this is arguable: follow-up work includes 
> implementing ranking recommendation evaluation metrics (such as precision@k, 
> recall@k, ndcg@k), similar to [Spark's 
> implementations|https://spark.apache.org/docs/1.5.0/mllib-evaluation-metrics.html#ranking-systems].
> It would be beneficial if we were able to design the API such that it could 
> be included in the proposed evaluation framework (see 
> [2157|https://issues.apache.org/jira/browse/FLINK-2157]), which makes it 
> necessary to consider the possible output type {{DataSet[(Int, 
> Array[Int])]}} or {{DataSet[(Int, Array[(Int,Double)])]}} meaning (user, 
> array of items), possibly including the predicted scores as well. See 
> [4713|https://issues.apache.org/jira/browse/FLINK-4713] for details.
> Another question arising is whether to provide this function as a member of 
> the ALS class, as a switch-like parameter of the ALS implementation 
> (meaning the model is either a rating or a ranking recommender model) or in 
> some other way.
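
The top-K selection described in the issue can be sketched with plain Scala collections. This is a hedged illustration only: `TopK.recommend` and its parameters are hypothetical, `Seq` and `Set` stand in for Flink `DataSet`s, and the tie-breaking rule (smaller item id first) is an assumption the issue does not specify.

```scala
// Hypothetical sketch; a real implementation would operate on Flink DataSets.
object TopK {
  // predicted: (user, item, predictedRating) triples.
  // seen: (user, item) pairs already present in the training data.
  // Returns (user, item, rank) triples with 1-based ranks.
  def recommend(
      predicted: Seq[(Int, Int, Double)],
      seen: Set[(Int, Int)],
      k: Int,
      excludeKnown: Boolean): Seq[(Int, Int, Int)] =
    predicted
      // Optionally drop items the user has already seen.
      .filter { case (user, item, _) => !(excludeKnown && seen((user, item))) }
      .groupBy(_._1)
      .toSeq
      .sortBy(_._1) // deterministic user order for the output
      .flatMap { case (user, rows) =>
        rows
          // Highest predicted rating first; ties broken by item id (assumption).
          .sortWith((a, b) => a._3 > b._3 || (a._3 == b._3 && a._2 < b._2))
          .take(k)
          .zipWithIndex
          .map { case ((_, item, _), index) => (user, item, index + 1) }
      }
}
```

Grouping the `(user, item, rank)` triples by user would give the `(user, array of items)` shape mentioned above, so the two output types carry the same information.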

