Yes, I would like to work on standardizing the code around input formats.

On Mon, Mar 3, 2014 at 7:37 PM, Suneel Marthi <suneel_mar...@yahoo.com> wrote:

> To get things moving for 1.0:
>
> a) Address the 4 issues that Sean had raised - we have already started
> looking at Backlog and closing them, started looking at converting old
> MapReduce to newer MapReduce API.
>
> If someone could start looking at standardizing the input/output
> formats across classifiers, clustering and recommenders that would be
> great. Guess Frank S. has already started work in that direction.
>
> b) Need a better and cleaner serialized form of Vectors to handle names
> and other kind'a stuff, this is gonna impact everything that's presently
> implemented.
>
> c) Agree with ssc, to start looking at Spark-Mahout integration.
>
> d) Need volunteers to QA/address issues with the present
> classifiers/clustering algorithms. I personally can vouch for how
> disastrous it is to deploy any of Mahout's classifiers/clustering
> implementations in an Operations environment. A good example of that is
> Sean's recent patch for RDF.
>
> Naive Bayes code as it is now seems half-baked and is incomplete. Not
> every code path has been tested on Streaming KMeans.
>
> This should go some way in addressing the technical debt that's been piled
> over the years.
>
> On Monday, March 3, 2014 1:05 PM, Sebastian Schelter <s...@apache.org>
> wrote:
>
> I would like to discuss whether we should start to have some
> Spark-related code in Mahout.
>
> --sebastian
>
> On 03/03/2014 06:56 PM, Suneel Marthi wrote:
> > Grant had setup a Google Hangout for Mahout sometime last year before
> > 0.8 release. I had one setup too for 0.9 release. I definitely wouldn't
> > want to have a hangout on Saturday or weekend.
> >
> > On Monday, March 3, 2014 12:52 PM, Ted Dunning <ted.dunn...@gmail.com>
> > wrote:
> >
> > Happy to organize a google hangout. That has the advantage of allowing
> > more attendees and supporting YouTube archiving.
> >
> > Sent from my iPhone
> >
> >> On Mar 3, 2014, at 9:34, Giorgio Zoppi <giorgio.zo...@gmail.com> wrote:
> >>
> >> Hello All,
> >> Dr.Dunning could you set a meeting next Sat morning, so we can chat and
> >> discuss by skype improvements and what to do and indentify volunteer and
> >> tasks.
> >> Best Regards,
> >> Giorgio
> >>
> >> 2014-03-03 18:30 GMT+01:00 peng <pc...@uowmail.edu.au>:
> >>
> >>> Me three
> >>>
> >>>> On Sun 02 Mar 2014 11:45:33 AM EST, Ted Dunning wrote:
> >>>>
> >>>> Ravi,
> >>>>
> >>>> Good points.
> >>>>
> >>>> On Sun, Mar 2, 2014 at 12:38 AM, Ravi Mummulla <ravi.mummu...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> - Natively support Windows (guidance, etc. No documentation exists
> >>>>> today, for instance)
> >>>>
> >>>> There is a bit of demand for that.
> >>>>
> >>>>> - Faster time to first application (from discovery to first application
> >>>>> currently takes a non-trivial amount of effort; how can we lower the bar
> >>>>> and reduce the friction for adoption?)
> >>>>
> >>>> There is huge evidence that this is important.
> >>>>
> >>>>> - Better documenting use cases with working samples/examples
> >>>>> (Documentation on https://mahout.apache.org/users/basics/algorithms.html
> >>>>> is spread out and there is too much focus on algorithms as opposed to
> >>>>> use cases - this is an adoption blocker)
> >>>>
> >>>> This is also important.
> >>>>
> >>>>> - Uniformity of the API set across all algorithms (are we providing the
> >>>>> same experience across all APIs?)
> >>>>
> >>>> And many people have been tripped up by this.
> >>>>
> >>>>> - Measuring/publishing scalability metrics of various algorithms (why
> >>>>> would we want users to adopt Mahout vs. other frameworks for ML at
> >>>>> scale?)
> >>>>
> >>>> I don't see this as important as some of your other points, but is still
> >>>> useful.
> >>
> >> --
> >> I want to be the ray of sunshine that wakes you each day
> >> to make you breathe and live in me.
> >> "Favola -Moda".