
On 25.03.2013, at 09:10, Sebastian Schelter wrote:

> Hi,
> throwing in my 2 cents here:
> I don't agree that we simply lack manpower but have a clear vision. I
> actually think it's the other way round. I think Mahout is kind of stuck,
> because it does not have a clear vision.

I fully agree: Mahout needs a vision. The big problem with ML is that you can
do everything with it, but to make a difference you have to focus.

I am using Mahout to solve business problems, e.g.:

- Online fraud
- eCommerce recommendations
- Demand forecasting

One big piece that is missing for all the algorithms is a complete, bundled data
set for a real business problem. By "bundled" I mean that it is in the Mahout
source tree. If no real data is available, generated data could be used.

I tried to fill this gap for recommendations with my GitHub project:


This project seems to be used by the community. You can get it, compile it, and
start it with four commands.

> ...
> It is also my personal experience (= I heard it over and over again from
> our users) that it is extremely hard to get started with Mahout using
> the available documentation. MiA is the exception to this, but people
> have to buy it first and it lacks a lot of the latest developments. It
> would be awesome to have a reworked wiki that is qualitatively
> comparable to MiA.

So this is the nature of a framework. If you really want people to get started
easily, you have to provide a full-blown example where they can just replace the
example data with their own.

I don't think that enough manpower can be acquired to create a GUI for
Mahout. Furthermore, I don't think that this would help. There are already
excellent GUIs for ML, e.g. Weka (http://www.cs.waikato.ac.nz/ml/weka/) and RStudio.

> Best,
> Sebastian

Hope this helps

> On 25.03.2013 07:29, Isabel Drost-Fromm wrote:
>> On Monday, March 25, 2013 07:22:46 AM Isabel Drost-Fromm wrote:
>>> On Sunday, March 24, 2013 05:38:00 PM Grant Ingersoll wrote:
>>>> On Mar 24, 2013, at 5:03 PM, Isabel Drost-Fromm wrote:
>>>>> What about an experiment: If you (reading this mail) were to write a two
>>>>> sentence vision statement for Mahout as you see it - what would that be?
>>>> Produce open source, scalable machine learning code using a community
>>>> development model.
>>> So taking that apart:
>>> - Hadoop is not necessarily part of the equation. All that we promise are
>>> implementations that are reasonably scalable.
>> - We play well with small-ish (fits in memory) and large (fits only in
>> memory of many machines) or huge (fits only on disk) datasets.
>>> - There is no restriction in there wrt. supporting only specific use cases -
>>> in particular no restriction to be recommendations only.
>>> - There is no restriction to "only batch" or "only online" learning.
>>> If we want to be that broad we definitely lack lots of people, I think.
>>> The other question that I cannot answer today: Do we want to be a Java
>>> Library that people link with their project, a standalone program that
>>> people interact with via the command line, a basis that people can easily
>>> integrate into their Pig/Hive/Cascalog/Scalding/Cascading/what-ever-else
>>> workflows or all of these?

Manuel Blechschmidt
M.Sc. IT Systems Engineering
Dortustr. 57
14467 Potsdam
Mobil: 0173/6322621
Twitter: http://twitter.com/Manuel_B
