Haider,

Thanks for offering to help out. I looked over your code on GitHub.
Unfortunately, it will need a lot of refactoring before we can use any of it.
But I might need your help coding up some adaptors and will keep you in the
loop.

Thanks again for offering to help out.

Cohan

On Wed, May 20, 2015 at 1:27 AM, Haider Ali <[email protected]> wrote:

> Hello Everyone
>
> The Naive Bayes classifier is a good initiative and I would like to
> contribute to this patch. I implemented this algorithm about a year ago and
> it's on my git Naive-Bayes-classifier
> <https://github.com/wonderer007/Naive-Bayes-classifier>. I am very new to
> open source contribution and am looking for help on how to contribute.
>
> Thanks
>
> On Tue, May 19, 2015 at 6:51 PM, Cohan Sujay Carlos <[email protected]>
> wrote:
>
> > Tommaso,
> >
> > I have created the Jira issue:
> > https://issues.apache.org/jira/browse/OPENNLP-777
> >
> > The details of the Java version compatibility and the classifier's
> > internals are as follows:
> >
> > "Implementation details: We have a production-hardened piece of Java code
> > for a multinomial Naive Bayesian classifier (with default Laplace
> > smoothing) that we'd like to contribute. The code is Java 1.5 compatible.
> > I'd have to write an adapter to make the interface compatible with the ME
> > classifier in OpenNLP. I expect the patch to be available 1 to 3 weeks
> > from now."
> >
> > This is the default configuration but the code is well-refactored and you
> > can actually plug in any smoothing algorithm and any feature set. It also
> > has some support for succinct memory models, and I later plan to add a
> > multivariate Bernoulli implementation as well (I wanted to start with the
> > multinomial version because the advantages of the multinomial model will
> > make it the better performer for most NLP projects).
> >
> > I could not figure out how to assign the issue to myself. The patch will
> > be available 1 to 3 weeks from now.
> >
> > Thanks and regards,
> >
> > Cohan Sujay Carlos
> >
> >
> > On Tue, May 19, 2015 at 5:26 PM, Tommaso Teofili <[email protected]>
> > wrote:
> >
> > > Hi Cohan,
> > >
> > > I think that'd be a very valuable contribution, as NB is one of the
> > > foundation algorithms, often used as a basis for comparisons.
> > > It would be good if you could create a Jira issue and provide more
> > > details about the implementation and, eventually, a patch.
> > >
> > > Thanks and regards,
> > > Tommaso
> > >
> > > 2015-05-19 9:57 GMT+02:00 Cohan Sujay Carlos <[email protected]>:
> > >
> > > > I have a question for the OpenNLP project team.
> > > >
> > > > I was wondering if there is a Naive Bayesian classifier implementation
> > > > in OpenNLP that I've not come across, or if there are plans to
> > > > implement one.
> > > >
> > > > If it is the latter, I should love to contribute an implementation.
> > > >
> > > > There is an ME classifier already available in OpenNLP, of course, but
> > > > I felt that there was an unmet need for a Naive Bayesian (NB)
> > > > classifier implementation to be offered as well.
> > > >
> > > > An NB classifier could be bootstrapped up with partially labelled
> > > > training data as explained in the Nigam, McCallum, et al. paper of 2000
> > > > "Text Classification from Labeled and Unlabeled Documents using EM".
> > > >
> > > > So, if there isn't an NB code base out there already, I'd be happy to
> > > > contribute a very solid implementation that we've used in production
> > > > for a good 5 years.
> > > >
> > > > I'd have to adapt it to load the same training data format as the ME
> > > > classifier, but I guess that shouldn't be very difficult to do.
> > > >
> > > > I was wondering if there was some interest in adding an NB
> > > > implementation and I'd love to know who I could coordinate with if
> > > > there is.
> > > >
> > > > Cohan Sujay Carlos
> > > > CEO, Aiaioo Labs, India
> > > > +91-77605-80015 +91-80-4125-0730
> > > >
> > >
> >
>
> --
> Haider Ali
> National University of Computer and Emerging Sciences Lahore
>
