Note that there are already a few decent C++ machine learning & data
mining libraries, under various licenses:

1) http://ai.stanford.edu/users/ronnyk/mlc.html : Public domain
2) http://www.sgi.com/tech/mlc/ : Enhanced and research-only version of #1
3) http://waffles.sourceforge.net/ : LGPL
4) http://plearn.berlios.de/ : BSD - being actively developed
5) http://shark-project.sourceforge.net/
6) http://dlib.net/ : Historical Permission Notice and Disclaimer
7) http://www.torch.ch : BSD

And these are just a few, excluding the ones using the GPL.  I guess I'm
pointing out that there are lots of people who have been down this
road before.

The idea of having Mahout available via a remote HTTP interface is pretty
cool... but that is more general purpose than a C++ port.

Also note the lesson of CLucene.  They've done a great job with their
C++ port; I used it to gut the internals of HtDig (leaving only the
original webspider, configuration, and C/PHP APIs).  However, they are
constantly chasing Java Lucene and only keep up in fits and starts.

- Neal Richter, a mahout lurker

On Fri, Feb 5, 2010 at 1:41 PM, Israel Ekpo <israele...@gmail.com> wrote:
> Thanks everyone for your responses so far.
>
> The Apache Hadoop dependency was something I thought about initially but I
> still went ahead to ask the question anyways.
>
> At this time, it would be a better use of resources and time to come up with
> a wrapper or an HTTP server/client setup of some sort.
>
> My reasoning is the Hadoop dependency and the volatile nature of the API,
> as pointed out by Sean and Robin.
>
> Thanks again for all your responses.
>
> On Thu, Feb 4, 2010 at 12:22 PM, Atul Kulkarni <atulskulka...@gmail.com> wrote:
>
>> Hey guys,
>>
>> My 1 cent...
>>
>> I would be really happy to contribute to this task of enabling use of
>> Mahout via C++ (wrapper or port, either way). I have some experience with
>> C++ and have been wanting to use Mahout via C++ (as that is my comfort
>> zone compared to Java).
>>
>> I think a port will put the code directly in the hands of C++ developers,
>> which sounds really exciting to me as a C++ developer. But I also
>> understand the concern about maintaining two different code bases for the
>> same task, and hence also like the idea of writing wrappers. So I am
>> divided between the two options; either works for me.
>>
>> Regards,
>> Atul.
>>
>> On Thu, Feb 4, 2010 at 10:54 AM, Robin Anil <robin.a...@gmail.com> wrote:
>>
>> > Hi Israel. I think it's a wonderful idea to have ports of Mahout; it tells
>> > us that we have a great platform that people really want to use. The only
>> > concern is that Hadoop is still in Java and they are not going with C++.
>> > They work around it by using native libraries to execute CPU-intensive
>> > tasks like sorting and compressing. The reason is that Java is much
>> > easier to manage in such a distributed system (I guess a lot of people
>> > may differ in opinion).
>> >
>> > Regardless, I guess wrappers could be made to ease execution of Mahout
>> > algorithms from any language. If that's a solution you like, then folks
>> > here can concentrate on improving just one code base.
>> >
>> > Robin
>> >
>> > On Thu, Feb 4, 2010 at 10:08 PM, Israel Ekpo <israele...@gmail.com> wrote:
>> >
>> > > Hey guys,
>> > >
>> > > First of all, I would like to start by thanking all the committers and
>> > > contributors for all their hard work so far on this project.
>> > >
>> > > Most importantly, I want to thank the Apache Mahout community for
>> > > bringing this very promising project to where it is now.
>> > >
>> > > It's pretty amazing to see what the project has accomplished in a short
>> > > span of 2 years.
>> > >
>> > > I strongly believe that Apache Mahout is really going to change things
>> > > around for the data mining and machine learning community the same way
>> > > Apache Lucene and Apache Solr are taking over this sector as we speak.
>> > >
>> > > Currently Apache Mahout is only available in Java, and there are a lot
>> > > of tools in Mahout that are very useful and that a lot of people
>> > > (students, instructors, researchers and computer scientists) are using
>> > > daily.
>> > >
>> > > I think it would be nice if all of these tools in Mahout were also
>> > > available in C++ so that users who already have systems written in C++
>> > > can plug in and integrate Mahout a lot more easily with their existing
>> > > or planned C++ systems.
>> > >
>> > > If we have the C++ port up and running, possibly more members of the
>> > > data mining and machine learning community could get involved, and
>> > > ideas could be shuffled in both directions (Java and C++ port).
>> > >
>> > > I will volunteer to spearhead this porting effort to get things
>> > > started.
>> > >
>> > > I am sending this message to all members of the Apache Mahout community
>> > > to ask what you think should be done to get this porting effort up and
>> > > running.
>> > >
>> > > Thanks in advance for your constructive and anticipated responses.
>> > >
>> > > Sincerely,
>> > > Israel Ekpo
>> > >
>> > > --
>> > > "Good Enough" is not good enough.
>> > > To give anything less than your best is to sacrifice the gift.
>> > > Quality First. Measure Twice. Cut Once.
>> > > http://www.israelekpo.com/
>> > >
>> >
>>
>>
>>
>> --
>> Regards,
>> Atul Kulkarni
>> www.d.umn.edu/~kulka053 <http://www.d.umn.edu/%7Ekulka053>
>>
>
>
>
> --
> "Good Enough" is not good enough.
> To give anything less than your best is to sacrifice the gift.
> Quality First. Measure Twice. Cut Once.
> http://www.israelekpo.com/
>
