[ https://issues.apache.org/jira/browse/OPENNLP-777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Cohan Sujay Carlos updated OPENNLP-777:
---------------------------------------
    Attachment:     (was: naive-bayesian-classifier-for-opennlp-1.6.0-rc6-with-test-cases.patch)

> Naive Bayesian Classifier
> -------------------------
>
>                 Key: OPENNLP-777
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-777
>             Project: OpenNLP
>          Issue Type: New Feature
>          Components: Machine Learning
>         Environment: J2SE 1.5 and above
>            Reporter: Cohan Sujay Carlos
>            Assignee: Tommaso Teofili
>            Priority: Minor
>              Labels: NBClassifier, bayes, bayesian, classifier, multinomial,
>                      naive, patch
>         Attachments: NaiveBayesCorrectnessTest.java,
>                      naive-bayes-classifier-2-adding-fixes-requested-by-joern-on-20-oct-2015.patch,
>                      topics.train
>
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> I thought it would be nice to have a Naive Bayesian classifier in OpenNLP (it
> lacks one at present).
> Implementation details: We have a production-hardened piece of Java code for
> a multinomial Naive Bayesian classifier (with default Laplace smoothing) that
> we'd like to contribute. The code is Java 1.5 compatible. I'd have to write
> an adapter to make the interface compatible with the ME classifier in
> OpenNLP. I expect the patch to be available 1 to 3 weeks from now.
> Below is the email trail of a discussion in the dev mailing list around this
> dated May 19th, 2015.
> <snip>
> Tommaso Teofili via opennlp.apache.org
> to dev
> Hi Cohan,
> I think that'd be a very valuable contribution, as NB is one of the
> foundation algorithms, often used as basis for comparisons.
> It would be good if you could create a Jira issue and provide more details
> about the implementation and, eventually, a patch.
> Thanks and regards,
> Tommaso
> </snip>
> 2015-05-19 9:57 GMT+02:00 Cohan Sujay Carlos
> > I have a question for the OpenNLP project team.
> >
> > I was wondering if there is a Naive Bayesian classifier implementation in
> > OpenNLP that I've not come across, or if there are plans to implement one.
> >
> > If it is the latter, I should love to contribute an implementation.
> >
> > There is an ME classifier already available in OpenNLP, of course, but I
> > felt that there was an unmet need for a Naive Bayesian (NB) classifier
> > implementation to be offered as well.
> >
> > An NB classifier could be bootstrapped up with partially labelled training
> > data as explained in the Nigam, McCallum, et al paper of 2000 "Text
> > Classification from Labeled and Unlabeled Documents using EM".
> >
> > So, if there isn't an NB code base out there already, I'd be happy to
> > contribute a very solid implementation that we've used in production for a
> > good 5 years.
> >
> > I'd have to adapt it to load the same training data format as the ME
> > classifier, but I guess that shouldn't be very difficult to do.
> >
> > I was wondering if there was some interest in adding an NB implementation
> > and I'd love to know who could I coordinate with if there is?
> >
> > Cohan Sujay Carlos
> > CEO, Aiaioo Labs, India

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)