[ https://issues.apache.org/jira/browse/MAHOUT-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14088415#comment-14088415 ]
ASF GitHub Bot commented on MAHOUT-1493:
----------------------------------------

Github user dlyubimov commented on a diff in the pull request:

    https://github.com/apache/mahout/pull/32#discussion_r15908947

    --- Diff: spark/src/main/scala/org/apache/mahout/sparkbindings/drm/classification/NaiveBayes.scala ---
    @@ -0,0 +1,74 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements. See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License. You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.mahout.sparkbindings.drm.classification
    +
    +import org.apache.mahout.math.drm._
    +import org.apache.mahout.math.scalabindings
    +import org.apache.mahout.math.scalabindings._
    +import org.apache.mahout.classifier.naivebayes.NaiveBayesModel
    +import org.apache.mahout.classifier.naivebayes.training.ComplementaryThetaTrainer
    +
    +import scala.reflect.ClassTag
    +
    +/**
    + * Distributed training of a Naive Bayes model. Follows the approach presented in Rennie et.al.: Tackling the poor
    + * assumptions of Naive Bayes Text classifiers, ICML 2003, http://people.csail.mit.edu/jrennie/papers/icml03-nb.pdf
    + */
    +object NaiveBayes {
    +
    +  /** default value for the smoothing parameter */
    +  def defaultAlphaI = 1f
    --- End diff --

    Mahout convention is to write these as `1.0` rather than a float.

> Port Naive Bayes to the Spark DSL
> ---------------------------------
>
> Key: MAHOUT-1493
> URL: https://issues.apache.org/jira/browse/MAHOUT-1493
> Project: Mahout
> Issue Type: Bug
> Components: Classification
> Reporter: Sebastian Schelter
> Assignee: Sebastian Schelter
> Fix For: 1.0
>
> Attachments: MAHOUT-1493.patch, MAHOUT-1493.patch, MAHOUT-1493.patch,
> MAHOUT-1493.patch, MAHOUT-1493a.patch
>
>
> Port our Naive Bayes implementation to the new spark dsl. Shouldn't require
> more than a few lines of code.

--
This message was sent by Atlassian JIRA
(v6.2#6252)