+1

On Mar 25, 2013, at 4:43 AM, Manuel Blechschmidt <[email protected]> wrote:
> Hello,
>
> On 25.03.2013, at 09:10, Sebastian Schelter wrote:
>
>> Hi,
>>
>> throwing in my 2 cents here:
>>
>> I don't agree that we simply lack manpower but have a clear vision. I
>> actually think it's the other way round. I think Mahout is kind of stuck,
>> because it does not have a clear vision.
>
> I fully agree. So I think Mahout needs a vision. The big problem with ML is
> that you can do everything with it, but to make a difference you have to focus.
>
> I am using Mahout for solving business problems, e.g.:
>
> - Online fraud
> - eCommerce recommendations
> - Demand forecasting
>
> One big piece that is missing for all the algorithms is a complete bundled
> data set that solves a real business problem, and by bundled I mean that
> it is in the Mahout source tree. If no real data is available, generated data
> could be used.
>
> I tried to fill this gap for recommendations with my GitHub project:
>
> https://github.com/ManuelB/facebook-recommender-demo
>
> This project seems to be used by the community. You can get it, compile it
> and start it with 4 commands.
>
>> ...
>>
>> It is also my personal experience (= I heard it over and over again from
>> our users) that it is extremely hard to get started with Mahout using
>> the available documentation. MiA is the exception to this, but people
>> have to buy it first and it lacks a lot of the latest developments. It
>> would be awesome to have a reworked wiki that is qualitatively
>> comparable to MiA.
>
> So this is the nature of a framework. If you really want people to get
> started easily you have to provide a full-blown example where you can just
> replace the example data with your data.
>
> I don't think that enough manpower can be acquired to create a visual GUI for
> Mahout. Further, I don't think that this would help. There are already
> excellent GUIs for ML, e.g. Weka (http://www.cs.waikato.ac.nz/ml/weka/) and
> RStudio (http://www.rstudio.com/)
>
>>
>> Best,
>> Sebastian
>
> Hope this helps
> Manuel
>
>>
>> On 25.03.2013 07:29, Isabel Drost-Fromm wrote:
>>>
>>> On Monday, March 25, 2013 07:22:46 AM Isabel Drost-Fromm wrote:
>>>> On Sunday, March 24, 2013 05:38:00 PM Grant Ingersoll wrote:
>>>>> On Mar 24, 2013, at 5:03 PM, Isabel Drost-Fromm wrote:
>>>>>> What about an experiment: If you (reading this mail) were to write a two
>>>>>> sentence vision statement for Mahout as you see it - what would that be?
>>>>>
>>>>> Produce open source, scalable machine learning code using a community
>>>>> development model.
>>>>
>>>> So taking that apart:
>>>>
>>>> - Hadoop is not necessarily part of the equation. All that we promise are
>>>>   implementations that are reasonably scalable.
>>>
>>> - We play well with small-ish (fits in memory) and large (fits only in
>>>   memory of many machines) or huge (fits only on disk) datasets.
>>>
>>>> - There is no restriction in there wrt. supporting only specific use
>>>>   cases - in particular no restriction to be recommendations only.
>>>>
>>>> - There is no restriction to "only batch" or "only online" learning.
>>>>
>>>> If we want to be that broad we definitely lack lots of people, I think.
>>>>
>>>> The other question that I cannot answer today: Do we want to be a Java
>>>> Library that people link with their project, a standalone program that
>>>> people interact with via the command line, a basis that people can easily
>>>> integrate into their Pig/Hive/Cascalog/Scalding/Cascading/what-ever-else
>>>> workflows or all of these?
>
> --
> Manuel Blechschmidt
> M.Sc. IT Systems Engineering
> Dortustr. 57
> 14467 Potsdam
> Mobil: 0173/6322621
> Twitter: http://twitter.com/Manuel_B
>
