[ https://issues.apache.org/jira/browse/MAHOUT-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064027#comment-14064027 ]
Andrew Palumbo commented on MAHOUT-1493:
----------------------------------------

[~cviebig],

The patch looks good. I've made some edits (against your branch) and will attach an M-1493a.patch shortly.

I put the trainComplementary parameter back in, as it is needed to distinguish between Standard and Complementary models in the NaiveBayesModel constructor. I've also added a thetaNormalizer var, which can remain null when passed to the NaiveBayesModel constructor unless a Complementary NB model is being trained. See:

https://github.com/apache/mahout/blob/master/mrlegacy/src/main/java/org/apache/mahout/classifier/naivebayes/NaiveBayesModel.java

I'm not sure whether creating a null var as I've done here is best practice in Scala, but I wanted to give you an idea of the NaiveBayesModel design.

As you've noted, there has been a lot of refactoring going on. As for moving the code, I think that for now it would be a good idea to keep it in the `spark` module, and to move the `org.apache.mahout.sparkbindings.drm.classification` package out of `org.apache.mahout.sparkbindings.drm` and into a new `org.apache.mahout.classification` package. I believe that would be a good place for it for now. There shouldn't be any need to move any of the Java code from mrlegacy.

> Port Naive Bayes to the Spark DSL
> ---------------------------------
>
>                 Key: MAHOUT-1493
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1493
>             Project: Mahout
>          Issue Type: Bug
>          Components: Classification
>            Reporter: Sebastian Schelter
>            Assignee: Sebastian Schelter
>             Fix For: 1.0
>
>         Attachments: 1493a.patch, MAHOUT-1493.patch, MAHOUT-1493.patch, MAHOUT-1493.patch, MAHOUT-1493.patch
>
>
> Port our Naive Bayes implementation to the new Spark DSL. Shouldn't require more than a few lines of code.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
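
On the question raised in the comment about whether a null var is good Scala practice: one common alternative is to model the value with `Option`, so that "no normalizer" is expressed in the type. The sketch below is illustrative only. `SketchModel`, its fields, and `train` are hypothetical names and do not mirror the actual NaiveBayesModel constructor, and the normalizer values are placeholders rather than the real Complementary NB computation.

```scala
// Hypothetical stand-in for the model discussed above; NOT the real
// org.apache.mahout.classifier.naivebayes.NaiveBayesModel API.
final case class SketchModel(
    weightsPerLabel: Vector[Double],
    thetaNormalizer: Option[Vector[Double]], // None for a Standard NB model
    isComplementary: Boolean)

object SketchModel {
  // trainComplementary decides whether a normalizer is attached at all.
  def train(weights: Vector[Double], trainComplementary: Boolean): SketchModel = {
    val theta: Option[Vector[Double]] =
      if (trainComplementary) Some(weights.map(w => math.abs(w))) // placeholder values
      else None
    SketchModel(weights, theta, trainComplementary)
  }
}

object SketchDemo {
  def main(args: Array[String]): Unit = {
    val standard = SketchModel.train(Vector(1.0, -2.0), trainComplementary = false)
    val complementary = SketchModel.train(Vector(1.0, -2.0), trainComplementary = true)

    println(standard.thetaNormalizer.isDefined)
    println(complementary.thetaNormalizer.isDefined)

    // When a Java constructor expects a nullable reference, convert at the boundary.
    val forJava: Vector[Double] = standard.thetaNormalizer.orNull
    println(forJava == null)
  }
}
```

With this shape, callers on the Scala side have to handle the missing normalizer explicitly, and `orNull` confines the null to the single call site that hands the value to Java code.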