Hello,

On 25.03.2013, at 09:10, Sebastian Schelter wrote:
> Hi,
>
> throwing in my 2 cents here:
>
> I don't agree that we simply lack manpower but have a clear vision. I
> actually think it's the other way round. I think Mahout is kind of stuck,
> because it does not have a clear vision.

I fully agree. So I think Mahout needs a vision. The big problem with ML is that you can do everything with it, but to make a difference you have to focus.

I am using Mahout to solve business problems, e.g.:
- Online fraud
- eCommerce recommendations
- Demand forecasting

One big piece that is missing for all the algorithms is a complete bundled data set that solves a real business problem. By bundled I mean that it is in the Mahout source tree. If no real data is available, generated data could be used.

I tried to fill this gap for recommendations with my GitHub project:
https://github.com/ManuelB/facebook-recommender-demo

This project seems to be used by the community. You can get it, compile it and start it with 4 commands.

> ...
>
> It is also my personal experience (= I heard it over and over again from
> our users) that it is extremely hard to get started with Mahout using
> the available documentation. MiA is the exception to this, but people
> have to buy it first and it lacks a lot of the latest developments. It
> would be awesome to have a reworked wiki that is qualitatively
> comparable to MiA.

So this is the nature of a framework. If you really want people to get started easily, you have to provide a full-blown example where you can just replace the example data with your data.

I don't think that enough manpower can be acquired to create a visual GUI for Mahout. Further, I don't think that this would help. There are already excellent GUIs for ML, e.g.
Weka (http://www.cs.waikato.ac.nz/ml/weka/) and RStudio (http://www.rstudio.com/)

>
> Best,
> Sebastian

Hope this helps
Manuel

>
> On 25.03.2013 07:29, Isabel Drost-Fromm wrote:
>>
>>
>> On Monday, March 25, 2013 07:22:46 AM Isabel Drost-Fromm wrote:
>>> On Sunday, March 24, 2013 05:38:00 PM Grant Ingersoll wrote:
>>>> On Mar 24, 2013, at 5:03 PM, Isabel Drost-Fromm wrote:
>>>>> What about an experiment: If you (reading this mail) were to write a two
>>>>> sentence vision statement for Mahout as you see it - what would that be?
>>>>
>>>> Produce open source, scalable machine learning code using a community
>>>> development model.
>>>
>>> So taking that apart:
>>>
>>> - Hadoop is not necessarily part of the equation. All that we promise are
>>> implementations that are reasonably scalable.
>>
>> - We play well with small-ish (fits in memory) and large (fits only in
>> memory of many machines) or huge (fits only on disk) datasets.
>>
>>> - There is no restriction in there wrt. supporting only specific use cases -
>>> in particular no restriction to be recommendations only.
>>>
>>> - There is no restriction to "only batch" or "only online" learning.
>>>
>>> If we want to be that broad we definitely lack lots of people, I think.
>>>
>>> The other question that I cannot answer today: Do we want to be a Java
>>> Library that people link with their project, a standalone program that
>>> people interact with via the command line, a basis that people can easily
>>> integrate into their Pig/Hive/Cascalog/Scalding/Cascading/what-ever-else
>>> workflows or all of these?
>>
>>
>

--
Manuel Blechschmidt
M.Sc. IT Systems Engineering
Dortustr. 57
14467 Potsdam
Mobile: 0173/6322621
Twitter: http://twitter.com/Manuel_B