Saw a really awesome shark tank talk today at ApacheCon.

Had a conversation after and wanted to follow up.

The Apache MADlib-incubator project is machine learning on SQL (and also
close to graduation, as I understand it).

The Apache Mahout project is engine-neutral, roll-your-own machine learning
/ statistical algorithms (with a quickly growing canon of 'precanned'
algorithms).

(Both projects have a lot of other cool tricks, but let's table that for
now).

Based on a one-off discussion, it is highly likely that the 'hard part' of
writing engine bindings in Mahout has already been done by MADlib as a
matter of course. (That is, linear-algebra-like operations on 'matrices'
backed by SQL.)
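
For anyone who hasn't seen the 'matrices backed by SQL' idea before, here is
a minimal sketch of what I mean. To be clear, this is NOT MADlib's or
Mahout's API: the table layout (one row per matrix entry), the function name
matmul_sql, and the use of Python's built-in sqlite3 are all my own
assumptions, just to illustrate the general technique.

```python
import sqlite3


def matmul_sql(a, b):
    """Multiply two dense matrices (lists of lists) using only SQL.

    Hypothetical illustration: each matrix is stored as (i, j, v) rows,
    and the product is a join on the shared index plus a grouped sum.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE a (i INTEGER, j INTEGER, v REAL)")
    conn.execute("CREATE TABLE b (i INTEGER, j INTEGER, v REAL)")

    # Flatten each matrix into one row per entry.
    conn.executemany(
        "INSERT INTO a VALUES (?, ?, ?)",
        [(i, j, v) for i, row in enumerate(a) for j, v in enumerate(row)],
    )
    conn.executemany(
        "INSERT INTO b VALUES (?, ?, ?)",
        [(i, j, v) for i, row in enumerate(b) for j, v in enumerate(row)],
    )

    # C[i][j] = sum over k of A[i][k] * B[k][j]
    rows = conn.execute(
        """
        SELECT a.i, b.j, SUM(a.v * b.v)
        FROM a JOIN b ON a.j = b.i
        GROUP BY a.i, b.j
        """
    ).fetchall()
    conn.close()

    # Rebuild a dense result from the (i, j, v) rows.
    result = [[0.0] * len(b[0]) for _ in range(len(a))]
    for i, j, v in rows:
        result[i][j] = v
    return result
```

The point is only that the database engine does the arithmetic; an engine
binding would need to express the rest of the operator set (transpose,
elementwise ops, and so on) in a similar way.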

Mahout also brings some cool things to the table, like GPU acceleration.
(FYI for the MADlib project, just to get your wheels turning: as I
understand it, Mahout's GPU acceleration is C++ at the very low level,
reached through JavaCPP and other Java wrappers for C++ libraries.)

There are numerous other benefits I can think of, but that's the high level,
so everyone on each project gets the gist of it.

I think an integration (MADlib-based SQL bindings, for lack of a better
term) is potentially an easy win that would yield big advantages for both
projects, and I would like to propose some exploratory collaboration.

"Roll your own GPU-accelerated statistical algorithms on PostgreSQL and
other SQL engines, brought to you by Apache Mahout + Apache
MADlib-incubator" (or Apache MADlib-incubator + Apache Mahout, depending
on who is giving the conference talk ;)

Encouraging anyone interested to sign up for the appropriate dev list.
