[ https://issues.apache.org/jira/browse/SPARK-4001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14225728#comment-14225728 ]
Daniel Erenrich edited comment on SPARK-4001 at 11/26/14 4:39 AM:
------------------------------------------------------------------
I was about to start coding something like this when I noticed this ticket. What's the status here? Association rule algorithms in general (and Apriori in particular) are useful in collaborative filtering contexts (which MLlib already has code for). As far as library cohesiveness goes, my thought is that we could frame the inputs to look nearly identical to those of the matrix factorization code, with (basket_id, item_id) instead of (user_id, item_id). That input format would be inefficient, though, so maybe we'd support a second, more natural input format as well. This would sidestep the concern the sklearn folks had.

> Add Apriori algorithm to Spark MLlib
> ------------------------------------
>
> Key: SPARK-4001
> URL: https://issues.apache.org/jira/browse/SPARK-4001
> Project: Spark
> Issue Type: New Feature
> Components: MLlib
> Reporter: Jacky Li
> Assignee: Jacky Li
>
> Apriori is the classic algorithm for frequent item set mining in a transactional data set. It will be useful if the Apriori algorithm is added to MLlib in Spark.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
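The (basket_id, item_id) framing discussed in the comment can be sketched as follows. This is a hypothetical illustration in plain Python, standing in for the Spark RDD operations such an API would actually use; the function names `to_baskets` and `apriori` are illustrative, not part of any MLlib API:

```python
from collections import defaultdict

def to_baskets(pairs):
    """Group (basket_id, item_id) pairs -- the ALS-style input format
    suggested in the comment -- into one item set per basket."""
    baskets = defaultdict(set)
    for basket_id, item_id in pairs:
        baskets[basket_id].add(item_id)
    return list(baskets.values())

def apriori(baskets, min_support):
    """Return every itemset contained in at least `min_support` baskets,
    grown level by level in the classic Apriori fashion."""
    def support(itemset):
        return sum(1 for b in baskets if itemset <= b)

    items = set().union(*baskets)
    # Level 1: frequent single items.
    level = {frozenset([i]) for i in items if support({i}) >= min_support}
    frequent = set(level)
    while level:
        # Candidate generation (simplified): extend each frequent
        # k-itemset by one item, then prune by support.
        candidates = {s | {i} for s in level for i in items if i not in s}
        level = {c for c in candidates if support(c) >= min_support}
        frequent |= level
    return frequent
```

For example, `apriori(to_baskets([(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (3, 2), (3, 3)]), 2)` yields the five itemsets {1}, {2}, {3}, {1, 2}, and {2, 3}. A second, "more natural" input format as mentioned in the comment would simply accept the basket collection (the output of `to_baskets`) directly.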