[
https://issues.apache.org/jira/browse/MAHOUT-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrew Palumbo updated MAHOUT-1519:
-----------------------------------
Attachment: MAHOUT-1519.patch
I thought it would be simpler to remove all references to the
thetaNormalizer vector for standard NB models without making too many changes,
but it's in there pretty deep. To completely remove the thetaNormalizer
references, I added a field to NaiveBayesModel that explicitly marks the model
as complementary or standard (probably not a bad thing), in order to deal with
the differences in serialization, validation, etc. This way the constructor
can take a null value for the thetaNormalizer vector.
I made a lot of changes, so I went back and wrote a simpler, hackier patch
that doesn't touch NaiveBayesModel, but that one is much less stable and
cannot accept a null thetaNormalizer.
Let me know if there are too many changes here, and I'll submit that one
instead.
> Remove StandardThetaTrainer
> ---------------------------
>
> Key: MAHOUT-1519
> URL: https://issues.apache.org/jira/browse/MAHOUT-1519
> Project: Mahout
> Issue Type: Improvement
> Components: Classification
> Reporter: Sebastian Schelter
> Fix For: 1.0
>
> Attachments: MAHOUT-1519.patch
>
>
> [~Andrew_Palumbo] if I understand your work in MAHOUT-1504 correctly, the
> theta training is only necessary for complementary Naive Bayes, right?
> Then we should remove the StandardThetaTrainer and make the
> TrainNaiveBayesJob only do the theta training in the complementary case.
> Correct me if I'm missing something here.