If there were an effort to write this in C++, it would definitely be
useful. To get the maximum benefit, a port would pay off more over time
than a wrapper, even if it covered only a subset of the algorithms
supported by Mahout. A wrapper would serve the syntactic purpose, but it
would be a huge distraction when it comes to profiling and performance
tuning.
But, as has been pointed out earlier, the algorithms depend heavily on
the M-R framework, and hence the success of this effort would also be
tied to the maturity of the Hadoop C/C++ port. Something worth noting
before venturing along these lines.
On 02/04/2010 09:22 AM, Atul Kulkarni wrote:
Hey guys,
My 1 cent...
I would be really happy to contribute to this task of enabling use of Mahout
via C++ (wrapper or port, either way). I have some experience with C++ and
have been wanting to use Mahout via C++ (as that is my comfort zone compared
to Java).
I think a port will put the code directly in the hands of C++ developers,
which sounds really exciting to me as a C++ developer. But I also understand
the concern about maintaining two different code bases for the same task, and
hence I also like the idea of writing wrappers. So I am divided between the
two options; either works for me.
Regards,
Atul.
On Thu, Feb 4, 2010 at 10:54 AM, Robin Anil <robin.a...@gmail.com> wrote:
Hi Israel. I think it's a wonderful idea to have ports of Mahout; it tells
us that we have a great platform that people really want to use. The only
concern is that Hadoop is still in Java and they are not going with C++.
They work around it by using native libraries to execute CPU-intensive
tasks like sorting and compressing. The reason is that Java is much easier
to manage in such a distributed system (I guess a lot of people may differ
in opinion).
Regardless, I guess wrappers could be made to ease execution of Mahout
algorithms from any language. If that's a solution you like, then folks here
can concentrate on improving just one code base.
Robin
On Thu, Feb 4, 2010 at 10:08 PM, Israel Ekpo <israele...@gmail.com> wrote:
Hey guys,
First of all, I would like to start by thanking all the committers and
contributors for all their hard work so far on this project.
Most importantly, I want to thank the Apache Mahout community for bringing
this very promising project to where it is now.
It's pretty amazing to see what the project has accomplished in a short
span of 2 years.
I strongly believe that Apache Mahout is really going to change things
around for the data mining and machine learning community the same way
Apache Lucene and Apache Solr are taking over this sector as we speak.
Currently Apache Mahout is only available in Java, and there are a lot of
very useful tools in Mahout that a lot of people (students, instructors,
researchers and computer scientists) are using daily.
I think it would be nice if all of these tools in Mahout were also available
in C++, so that users who already have systems written in C++ can plug in
and integrate Mahout a lot more easily with their existing or planned C++
systems.
If we have the C++ port up and running, possibly more members of the data
mining and machine learning community could get involved, and ideas could
be shuffled in both directions (between the Java and C++ ports).
I will volunteer to spearhead this porting effort to get things started.
I am sending this message to all members of the Apache Mahout community to
ask what you think can and should be done to get this porting effort up and
running.
Thanks in advance for your constructive and anticipated responses.
Sincerely,
Israel Ekpo
--
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/