Grant,

Would the TLP be Mahout or under a different name?

I also like the idea that it does not necessarily have to be a 1:1 port.

Kay Kay,

I have changed my mind (about going the wrapper route); I think it would
be nice to explore the possibilities with just a subset of the
algorithms. That would be a good place to start.

I will be in touch.

On Feb 5, 2010, at 03:23 PM, Grant Ingersoll wrote:

One thought along these lines is that we should start the process to
become a TLP; then we could have a subproject explicitly dedicated to
C++ (or any other language), and there wouldn't necessarily need to be a
1-1 port.

-Grant

On Feb 5, 2010, at 12:56 AM, Kay Kay wrote:

If there were an effort to write in C++, it would definitely be useful,
and to exploit the maximum advantages, porting would be more beneficial
over time than a wrapper, even if it applied only to a subset of the
algorithms supported by Mahout.

A wrapper would serve the syntactic purpose, but when it comes to
profiling and performance extraction it would be a huge distraction.

But, as has been pointed out earlier, the algorithms depend on the M-R
framework very much, and hence the success of this effort would also be
tied to the maturity of the Hadoop C/C++ port. Something worth noting
before venturing along these lines.

On Fri, Feb 5, 2010 at 3:41 PM, Israel Ekpo <israele...@gmail.com> wrote:

> Thanks everyone for your responses so far.
>
> The Apache Hadoop dependency was something I thought about initially,
> but I still went ahead and asked the question anyway.
>
> At this time, it would be a better use of resources and time to come
> up with a wrapper or an HTTP server/client setup of some sort.
>
> My reasoning behind this is the Hadoop dependency and the volatile
> nature of the API, as pointed out by Sean and Robin.
>
> Thanks again for all your responses.
>
>
> On Thu, Feb 4, 2010 at 12:22 PM, Atul Kulkarni <atulskulka...@gmail.com> wrote:
>
>> Hey guys,
>>
>> My 1 cent...
>>
>> I would be really happy to contribute to this task of enabling use of
>> Mahout via C++ (wrapper or port, either way). I have some experience
>> with C++ and have been wanting to use Mahout via C++ (as that is my
>> comfort zone compared to Java).
>>
>> I think a port will put the code directly in the hands of C++
>> developers, which sounds really exciting to me as a C++ developer. But
>> I also understand the concern of maintaining two different code bases
>> for the same task, and hence also like the idea of writing wrappers.
>> So I am divided on the two options; either works for me.
>>
>> Regards,
>> Atul.
>>
>> On Thu, Feb 4, 2010 at 10:54 AM, Robin Anil <robin.a...@gmail.com> wrote:
>>
>> > Hi Israel. I think it's a wonderful idea to have ports of Mahout; it
>> > tells us that we have a great platform that people really want to
>> > use. The only concern is that Hadoop is still in Java and they are
>> > not going with C++. They work around it by using native libraries to
>> > execute CPU-intensive tasks like sorting and compressing. The reason
>> > being that Java is much easier to manage in such a distributed
>> > system (I guess a lot of people may differ in opinion).
>> >
>> > Regardless, I guess wrappers could be made to ease execution of
>> > Mahout algorithms from any language. If that's a solution you like,
>> > then folks here can concentrate on improving just one code base.
>> >
>> > Robin
>> >
>> > On Thu, Feb 4, 2010 at 10:08 PM, Israel Ekpo <israele...@gmail.com> wrote:
>> >
>> > > Hey guys,
>> > >
>> > > First of all I would like to start by thanking all the committers
>> > > and contributors for all their hard work so far on this project.
>> > >
>> > > Most importantly, I want to thank the Apache Mahout community for
>> > > bringing this very promising project to where it is now.
>> > >
>> > > It's pretty amazing to see what the project has accomplished in a
>> > > short span of 2 years.
>> > >
>> > > I strongly believe that Apache Mahout is really going to change
>> > > things around for the data mining and machine learning community
>> > > the same way Apache Lucene and Apache Solr are taking over this
>> > > sector as we speak.
>> > >
>> > > Currently Apache Mahout is only available in Java, and there are a
>> > > lot of tools in Mahout that are very useful and that a lot of
>> > > people (students, instructors, researchers and computer
>> > > scientists) are using daily.
>> > >
>> > > I think it would be nice if all of these tools in Mahout were also
>> > > available in C++ so that users that already have systems written
>> > > in C++ can plug in and integrate Mahout a lot more easily with
>> > > their existing or planned C++ systems.
>> > >
>> > > If we have the C++ port up and running, possibly more members of
>> > > the data mining and machine learning community could get involved
>> > > and ideas could be shuffled in both directions (Java and C++
>> > > port).
>> > >
>> > > I will volunteer to spearhead this porting effort to get things
>> > > started.
>> > >
>> > > I am sending this message to all members of the Apache Mahout
>> > > community to ask what you think should be done to get this
>> > > porting effort up and running.
>> > >
>> > > Thanks in advance for your constructive and anticipated responses.
>> > >
>> > > Sincerely,
>> > > Israel Ekpo
>> > >
>> > > --
>> > > "Good Enough" is not good enough.
>> > > To give anything less than your best is to sacrifice the gift.
>> > > Quality First. Measure Twice. Cut Once.
>> > > http://www.israelekpo.com/
>> > >
>> >
>>
>>
>>
>> --
>> Regards,
>> Atul Kulkarni
>> www.d.umn.edu/~kulka053 <http://www.d.umn.edu/%7Ekulka053>
>>
>
>
>
> --
> "Good Enough" is not good enough.
> To give anything less than your best is to sacrifice the gift.
> Quality First. Measure Twice. Cut Once.
> http://www.israelekpo.com/
>

--
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/