First steps towards the "loving care" (in my view):

a) Address the issues that Sean has brought up. I wasn't aware of (i) in that
list; otherwise I would have ensured that they were addressed in 0.9.

b) Most of the backlog JIRAs (about 28 of them today) go all the way back to
the initial stages of Mahout's evolution (pre-0.5). Some of them may just have
to be closed and resolved as "Will not do" or "Times Immemorial".

c) Fix algorithms that presently have half-baked code in them, such as the
Naive Bayes classifier (why is the thetaSummer commented out? Either we don't
need it or it needs fixing) and Streaming KMeans, which lacks adequate test
coverage and still fails along different paths. The same goes for the other
clustering algorithms too.

On Friday, February 28, 2014 3:30 PM, Andrew Musselman 
<andrew.mussel...@gmail.com> wrote:
 
>
> >
> > To be constructive, here are four items that seem more important for
> > something like "1.0.0" and are even a lot less work:
> >
> > - Use Hadoop .mapreduce API consistently
> > - Standardize input output formats of all jobs
> > - Remove use of deprecated code
> > - Clear even a third of the open JIRA backlog
> >
>
> Like I said, I believe the future is in moving ahead, building on strengths
> and finding a unique proposition. I agree with the above in the sense that
> out-of-core stuff that runs over MR could use some unification. I know you
> have done a lot in that department and I assume since you are writing to
> dev list, you are looking to help with that going forward. Cause if not...
> the dev lists are not exactly created to be an open forum for just giving
> lectures.
>

Can we agree that before we put an integer version on Mahout that it needs
some tender-loving care, and that we can still have high hopes?
