Note that there are already a few decent C++ machine learning & data mining libraries, under a variety of licenses:
1) http://ai.stanford.edu/users/ronnyk/mlc.html : Public domain
2) http://www.sgi.com/tech/mlc/ : Enhanced and research-only version of #1
3) http://waffles.sourceforge.net/ : LGPL
4) http://plearn.berlios.de/ : BSD - being actively developed
5) http://shark-project.sourceforge.net/
6) http://dlib.net/ : Historical Permission and Disclaimer
7) http://www.torch.ch : BSD

And these are just a few, excluding the ones that use the GPL. I guess I'm pointing out that lots of people have been down this road before.

The idea of making Mahout available over remote HTTP is pretty cool... but that is more general purpose than a C++ port.

Also note the lesson of CLucene. They've done a great job with their C++ port; I used it to gut the internals of HtDig (leaving only the original webspider, configuration, and C/PHP APIs). However, they are constantly chasing Java Lucene and only keep up in fits and starts.

- Neal Richter, a mahout lurker

On Fri, Feb 5, 2010 at 1:41 PM, Israel Ekpo <israele...@gmail.com> wrote:

> Thanks everyone for your responses so far.
>
> The Apache Hadoop dependency was something I thought about initially, but I
> still went ahead and asked the question anyway.
>
> At this time, it would be a better use of resources and time to come up with
> a wrapper or an HTTP server/client setup of some sort.
>
> My reasoning behind this is the Hadoop dependency and the volatile nature of
> the API, as pointed out by Sean and Robin.
>
> Thanks again for all your responses.
>
> On Thu, Feb 4, 2010 at 12:22 PM, Atul Kulkarni <atulskulka...@gmail.com> wrote:
>
>> Hey guys,
>>
>> My 1 cent...
>>
>> I would be really happy to contribute to this task of enabling use of Mahout
>> via C++ (wrapper or port, either way). I have some experience with C++ and
>> have been wanting to use Mahout via C++ (as that is my comfort zone compared
>> to Java).
>>
>> I think a port will put the code directly in the hands of C++ developers,
>> which sounds really exciting to me as a C++ developer. But I also understand
>> the concern of maintaining two different code bases for the same task, and
>> hence also like the idea of writing wrappers. So I am divided on the two
>> options; either works for me.
>>
>> Regards,
>> Atul.
>>
>> On Thu, Feb 4, 2010 at 10:54 AM, Robin Anil <robin.a...@gmail.com> wrote:
>>
>> > Hi Israel. I think it's a wonderful idea to have ports of Mahout; it tells
>> > us that we have a great platform that people really want to use. The only
>> > concern is that Hadoop is still in Java and they are not going with C++.
>> > They work around it by using native libraries to execute CPU-intensive
>> > tasks like sorting and compressing. The reason is that Java is much easier
>> > to manage in such a distributed system (I guess a lot of people may differ
>> > in opinion).
>> >
>> > Regardless, I guess wrappers could be made to ease execution of Mahout
>> > algorithms from any language. If that's a solution you like, then folks
>> > here can concentrate on improving just one code base.
>> >
>> > Robin
>> >
>> > On Thu, Feb 4, 2010 at 10:08 PM, Israel Ekpo <israele...@gmail.com> wrote:
>> >
>> > > Hey guys,
>> > >
>> > > First of all, I would like to start by thanking all the committers and
>> > > contributors for all their hard work so far on this project.
>> > >
>> > > Most importantly, I want to thank the Apache Mahout community for
>> > > bringing this very promising project to where it is now.
>> > >
>> > > It's pretty amazing to see what the project has accomplished in a short
>> > > span of 2 years.
>> > >
>> > > I strongly believe that Apache Mahout is really going to change things
>> > > around for the data mining and machine learning community the same way
>> > > Apache Lucene and Apache Solr are taking over this sector as we speak.
>> > >
>> > > Currently Apache Mahout is only available in Java, and there are a lot
>> > > of tools in Mahout that are very useful; a lot of people (students,
>> > > instructors, researchers and computer scientists) are using them daily.
>> > >
>> > > I think it would be nice if all of these tools in Mahout were also
>> > > available in C++, so that users who already have systems written in C++
>> > > can plug in and integrate Mahout a lot more easily with their existing
>> > > or planned C++ systems.
>> > >
>> > > If we have the C++ port up and running, possibly more members of the
>> > > data mining and machine learning community could get involved, and ideas
>> > > could be shuffled in both directions (Java and C++ port).
>> > >
>> > > I will volunteer to spearhead this porting effort to get things started.
>> > >
>> > > I am sending this message to all members of the Apache Mahout community
>> > > to ask what you think should be done to get this porting effort up and
>> > > running.
>> > >
>> > > Thanks in advance for your constructive and anticipated responses.
>> > >
>> > > Sincerely,
>> > > Israel Ekpo
>> > >
>> > > --
>> > > "Good Enough" is not good enough.
>> > > To give anything less than your best is to sacrifice the gift.
>> > > Quality First. Measure Twice. Cut Once.
>> > > http://www.israelekpo.com/
>> > >
>> >
>>
>>
>>
>> --
>> Regards,
>> Atul Kulkarni
>> www.d.umn.edu/~kulka053 <http://www.d.umn.edu/%7Ekulka053>
>>
>
>
>
> --
> "Good Enough" is not good enough.
> To give anything less than your best is to sacrifice the gift.
> Quality First. Measure Twice. Cut Once.
> http://www.israelekpo.com/
>