Hi Subbu,  

There is currently no way to update an already trained Naive Bayes model.
You would have to retrain on the full 2 million records.

If you expect to need this in the future, you could probably hack
TrainNaiveBayesJob.java [1] to meet your needs. For the update to be correct,
however, your new data would have to be vectorized in exactly the same manner
as the original data. That would limit you to pure term frequencies (no IDF
transformation) and would rule out options like maxDFPercent, since those
depend on document frequencies that change whenever new documents arrive.
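
To illustrate why pure term frequencies are the workable case, here is a rough
sketch (not Mahout code; the class and method names are hypothetical and only
for illustration). It assumes the trained state can be reduced to per-label
sums of raw term counts. Under that assumption, the sums from an old batch and
a new batch simply add up to the same totals you would get from training on
everything at once. IDF weights do not have this property.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Mahout: keeps per-label sums of raw
// term frequencies, which can be merged across training batches.
class AdditiveCounts {
    // label -> (term -> summed raw term frequency)
    private final Map<String, Map<String, Double>> weights = new HashMap<>();

    // Add one document's raw term frequencies under the given label.
    void add(String label, Map<String, Double> termFrequencies) {
        Map<String, Double> perLabel =
            weights.computeIfAbsent(label, k -> new HashMap<>());
        termFrequencies.forEach((term, tf) -> perLabel.merge(term, tf, Double::sum));
    }

    // Fold the sums from another batch into this one.
    void merge(AdditiveCounts other) {
        other.weights.forEach(this::add);
    }

    // Summed frequency of a term under a label, or 0.0 if never seen.
    double get(String label, String term) {
        return weights
            .getOrDefault(label, Collections.<String, Double>emptyMap())
            .getOrDefault(term, 0.0);
    }
}
```

With something like this, you would build one AdditiveCounts from the first
million records, another from the second million, and call merge() to combine
them before recomputing the model weights from the merged sums.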

Andy

[1] https://github.com/apache/mahout/blob/master/mrlegacy/src/main/java/org/apache/mahout/classifier/naivebayes/training/TrainNaiveBayesJob.java


> Hi team,
> I have trained a Naive Bayes model using training data of 1 million
> records. Now I have another 1 million records. Can I add this new training
> data to the existing model and train it again to get a new model, instead of
> passing all 2 million records at once?
>
> Thanks,
> Subbu
>
                                          
