[ https://issues.apache.org/jira/browse/SPARK-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15037211#comment-15037211 ]
Caique Rodrigues Marques edited comment on SPARK-8855 at 12/7/15 4:10 PM:
--------------------------------------------------------------------------

I'm working on this feature right now, but I have a question. The issue description says that the relevant method is "FPGrowthModel.generateAssociationRules()". However, it isn't clear whether a wrapper for the association rules will be available in "FPGrowthModelWrapper.scala", and that is my question.

My current idea for implementing this feature is the following:

1) Create the class "AssociationRules" inside the "fpm.py" file with the following declarations:
1.1) A method train(data, minConfidence) that generates the association rules for the given data with the specified minConfidence (default 0.6). This method calls "trainAssociationRules" from PythonMLLibAPI with the parameters data and minConfidence, and returns an FPGrowthModel.
1.2) A class Rule, a namedtuple that represents an antecedent or consequent tuple.
2) Add the method generateAssociationRules to the FPGrowthModel class (inside fpm.py). This method maps the rules generated (by calling the method "getAssociationRules" from FPGrowthModelWrapper) to the namedtuple.

Now comes my real problem: how do I make trainAssociationRules return an FPGrowthModel to the wrapper, so that the wrapper can map each rule received to its antecedent/consequent? I can't make trainAssociationRules return an FPGrowthModel. The wrapper for association rules is in FPGrowthModelWrapper, right? Is there something wrong with this idea?
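The Python side of step 1.2 and step 2 could be sketched roughly as below. This is only an illustration under stated assumptions, not the actual fpm.py code: it assumes a Rule is an (antecedent, consequent) pair, and "rules_from_pairs" is a hypothetical helper in which a plain Python list stands in for whatever the JVM wrapper would return.

```python
from collections import namedtuple

# Step 1.2 (assumed shape): a rule is an (antecedent, consequent) pair.
Rule = namedtuple("Rule", ["antecedent", "consequent"])


def rules_from_pairs(pairs):
    """Map raw (antecedent, consequent) pairs to Rule namedtuples.

    Hypothetical helper: in a real wrapper the pairs would come back from
    the JVM side; here a plain list stands in for that result.
    """
    return [Rule(list(antecedent), list(consequent))
            for antecedent, consequent in pairs]


# Stand-in data for what the wrapper might hand back to Python.
raw_pairs = [(["a", "b"], ["c"]), (["d"], ["e"])]
rules = rules_from_pairs(raw_pairs)

for rule in rules:
    print(rule.antecedent, "=>", rule.consequent)
```

If the pairs arrived as an RDD instead of a list, the same mapping would presumably be a single map call over that RDD.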
For illustration, I'm thinking of something like this in PythonMLLibAPI and in FPGrowthModelWrapper, respectively:

{code:none}
// PythonMLLibAPI.scala
def trainAssociationRules(
    data: JavaRDD[FPGrowth.FreqItemset[Any]],
    minConfidence: Double): [return type] = {
  val model = new FPGrowthModel(data.rdd)
    .generateAssociationRules(minConfidence)
  new FPGrowthModelWrapper(model) // will fail
}
-----------------------------------------------------------------------
// FPGrowthModelWrapper.scala
def getAssociationRules: [return type] = {
  SerDe.fromTuple2RDD(rule.map(x => (x.javaAntecedent, x.javaConsequent)))
}
{code}

Any suggestions?

was (Author: caique):
I am working on this, but I have a question. The issue description says that the relevant method is "FPGrowthModel.generateAssociationRules()", of course. However, it is not clear whether a wrapper for the association rules will live in "FPGrowthModelWrapper.scala", and this is the problem.

My idea is the following:

1) In the fpm.py file, a class "AssociationRules" with one method and one class:
1.1) A method train(data, minConfidence) that generates the association rules for the given data with the specified minConfidence (default 0.6). This method calls "trainAssociationRules" from PythonMLLibAPI with the parameters data and minConfidence. Later, it will return an FPGrowthModel.
1.2) A class Rule that will be a namedtuple and represents an (antecedent, consequent) tuple.
2) Still in fpm.py, a new method called generateAssociationRules will be added to the class FPGrowthModel. It will map the rules generated by calling the method "getAssociationRules" from FPGrowthModelWrapper to the namedtuple.

Now my question: how do I make trainAssociationRules return an FPGrowthModel, so that the wrapper just maps each rule received to its antecedent/consequent? I could not make the method trainAssociationRules return an FPGrowthModel. The wrapper for association rules is in FPGrowthModelWrapper, right? Is there something wrong with the idea?
For illustration, I'm thinking of something like this in PythonMLLibAPI and in FPGrowthModelWrapper, respectively:

{code:none}
// PythonMLLibAPI.scala
def trainAssociationRules(
    data: JavaRDD[FPGrowth.FreqItemset[Any]],
    minConfidence: Double): [return type] = {
  val model = new FPGrowthModel(data.rdd)
    .generateAssociationRules(minConfidence)
  new FPGrowthModelWrapper(model) // will fail
}
-----------------------------------------------------------------------
// FPGrowthModelWrapper.scala
def getAssociationRules: [return type] = {
  SerDe.fromTuple2RDD(rule.map(x => (x.javaAntecedent, x.javaConsequent)))
}
{code}

Any suggestions?

> Python API for Association Rules
> --------------------------------
>
>                 Key: SPARK-8855
>                 URL: https://issues.apache.org/jira/browse/SPARK-8855
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>            Reporter: Feynman Liang
>            Priority: Minor
>
> A simple Python wrapper and doctests need to be written for Association
> Rules. The relevant method is {{FPGrowthModel.generateAssociationRules}}. The
> code will likely live in {{fpm.py}}.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org