[ 
https://issues.apache.org/jira/browse/MAHOUT-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Palumbo updated MAHOUT-1519:
-----------------------------------

    Attachment: MAHOUT-1519.patch

I thought it was going to be simpler to remove all references to the 
thetaNormalizer vector for Standard NB models without making too many changes, 
but it's in there pretty deep.  To completely remove any thetaNormalizer 
references, I added a field to NaiveBayesModel to explicitly define it as 
complementary or standard (probably not a bad thing) in order to deal with all 
of the serialization and validation differences, etc.  This way the constructor 
can take a null value for the thetaNormalizer vector.
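
As a rough illustration of that idea (not the code in the attached patch), the 
sketch below uses a hypothetical class, SketchNaiveBayesModel, with plain 
double[] arrays standing in for Mahout vectors. The field and method names are 
assumptions made for the example. It shows an explicit complementary flag that 
lets validation accept a null thetaNormalizer for standard models only.

```java
// Hypothetical stand-in for NaiveBayesModel; names and types are assumptions,
// not the actual Mahout API.
final class SketchNaiveBayesModel {
  private final double[] weightsPerLabel;
  private final double[] thetaNormalizer; // stays null for standard models
  private final boolean complementary;

  SketchNaiveBayesModel(double[] weightsPerLabel,
                        double[] thetaNormalizer,
                        boolean complementary) {
    if (weightsPerLabel == null || weightsPerLabel.length == 0) {
      throw new IllegalArgumentException("weightsPerLabel must be non-empty");
    }
    // Only the complementary variant requires the normalizer, so only that
    // variant is validated against it.
    if (complementary) {
      if (thetaNormalizer == null) {
        throw new IllegalArgumentException(
            "complementary model requires a thetaNormalizer");
      }
      if (thetaNormalizer.length != weightsPerLabel.length) {
        throw new IllegalArgumentException(
            "thetaNormalizer must have one entry per label");
      }
    }
    this.weightsPerLabel = weightsPerLabel.clone();
    // A standard model ignores whatever was passed and stores null.
    this.thetaNormalizer = complementary ? thetaNormalizer.clone() : null;
    this.complementary = complementary;
  }

  boolean isComplementary() {
    return complementary;
  }

  int numLabels() {
    return weightsPerLabel.length;
  }

  double thetaNormalizer(int label) {
    if (!complementary) {
      throw new IllegalStateException(
          "standard model has no thetaNormalizer");
    }
    return thetaNormalizer[label];
  }
}
```

With a flag like this, serialization code could branch on isComplementary() 
and skip the normalizer entirely for standard models, instead of writing a 
placeholder vector.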

I made a lot of changes, so I went back and wrote a simpler/hackish patch 
which doesn't touch NaiveBayesModel, but that one is much less stable and 
cannot accept a null thetaNormalizer. 

Let me know if there are too many changes here, and I'll submit that one. 


> Remove StandardThetaTrainer
> ---------------------------
>
>                 Key: MAHOUT-1519
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1519
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Classification
>            Reporter: Sebastian Schelter
>             Fix For: 1.0
>
>         Attachments: MAHOUT-1519.patch
>
>
> [~Andrew_Palumbo] if I understand your work in MAHOUT-1504 correctly, the 
> theta training is only necessary for complementary Naive Bayes, right?
> Then we should remove the StandardThetaTrainer and make the 
> TrainNaiveBayesJob only do the theta training in the complementary case.
> Correct me if I'm missing something here.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
