Re: Proposing a C++ Port for Apache Mahout

Kay Kay Sat, 06 Feb 2010 17:21:58 -0800

On 02/05/2010 01:48 PM, Israel Ekpo wrote:

Grant,


Would the TLP be Mahout or under a different name?

I also like the idea that it does not necessarily have to be a 1:1 port.

Kay Kay,

I have changed my mind (about going the wrapper route). I think it would be
nice to explore the possibilities with just a subset of the algorithms.

That would be a good place to start.

I will be in touch

Sure, Israel. Meanwhile, as was pointed out earlier in a different thread, it would be useful to do an informal case study of competing algorithms before starting, for reference.


Best of luck on starting this one up!

On Feb 5, 2010, at 03:23 PM, Grant Ingersoll wrote:

One thought along these lines is that we should start the process to become
a TLP; then we could have a subproject explicitly dedicated to C++ (or any
other language), and there wouldn't necessarily need to be a 1:1 port.

-Grant

On Feb 5, 2010, at 12:56 AM, Kay Kay wrote:

If there were an effort to write in C++, it would definitely be useful. To
exploit the maximum advantages, porting would be more beneficial over time
than a wrapper, even if it applied only to a subset of the algorithms
supported by Mahout. A wrapper would serve the syntactic purpose, but it
would be a huge distraction when it comes to profiling and performance
extraction.

But, as was pointed out earlier, the algorithms depend very much on the M-R
framework, and hence the success of this effort would also be tied to the
maturity of the Hadoop C/C++ port. Something worth noting before venturing
along these lines.


On Fri, Feb 5, 2010 at 3:41 PM, Israel Ekpo <[email protected]> wrote:

Thanks everyone for your responses so far.

The Apache Hadoop dependency was something I thought about initially, but I
still went ahead and asked the question anyway.

At this time, it would be a better use of resources and time to come up
with a wrapper or an HTTP server/client setup of some sort.

My reasoning is the Hadoop dependency and the volatile nature of the API,
as pointed out by Sean and Robin.

Thanks again for all your responses.


On Thu, Feb 4, 2010 at 12:22 PM, Atul Kulkarni <[email protected]> wrote:

Hey guys,

My 1 cent...

I would be really happy to contribute to this task of enabling use of
Mahout via C++ (wrapper or port, either way). I have some experience with
C++ and have been wanting to use Mahout via C++ (as that is my comfort zone
compared to Java).

I think a port will put the code directly in the hands of C++ developers,
which sounds really exciting to me as a C++ developer. But I also
understand the concern of maintaining two different code bases for the same
task, and hence also like the idea of writing wrappers. So I am divided on
the two options; either works for me.

Regards,
Atul.

On Thu, Feb 4, 2010 at 10:54 AM, Robin Anil <[email protected]> wrote:

Hi Israel. I think it's a wonderful idea to have ports of Mahout; it tells
us that we have a great platform that people really want to use. The only
concern is that Hadoop is still in Java and they are not going with C++.
They work around it by using native libraries to execute CPU-intensive
tasks like sorting and compressing. The reason is that Java is much easier
to manage in such a distributed system (I guess a lot of people may differ
in opinion).

Regardless, I guess wrappers could be made to ease execution of Mahout
algorithms from any language. If that's a solution you like, then folks
here can concentrate on improving just one code base.

Robin

On Thu, Feb 4, 2010 at 10:08 PM, Israel Ekpo <[email protected]> wrote:

Hey guys,

First of all, I would like to start by thanking all the committers and
contributors for all their hard work so far on this project.

Most importantly, I want to thank the Apache Mahout community for bringing
this very promising project to where it is now.

It's pretty amazing to see what the project has accomplished in a short
span of 2 years.

I strongly believe that Apache Mahout is really going to change things
around for the data mining and machine learning community the same way
Apache Lucene and Apache Solr are taking over this sector as we speak.

Currently Apache Mahout is only available in Java, and there are a lot of
tools in Mahout that are very useful; a lot of people (students,
instructors, researchers and computer scientists) are using them daily.

I think it would be nice if all of these tools in Mahout were also
available in C++, so that users who already have systems written in C++ can
plug in and integrate Mahout a lot more easily with their existing or
planned C++ systems.

If we have the C++ port up and running, possibly more members of the data
mining and machine learning community could get involved, and ideas could
be shuffled in both directions (Java and C++ port).

I will volunteer to spearhead this porting effort to get things started.

I am sending this message to all members of the Apache Mahout community to
ask what you think should be done to get this porting effort up and
running.

Thanks in advance for your constructive and anticipated responses.

Sincerely,
Israel Ekpo

--
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/



--
Regards,
Atul Kulkarni
http://www.d.umn.edu/~kulka053



