Under Release Highlights, please also add:

a) Dan's Streaming k-means clustering.
b) Mahout upgrade for Lucene 4.3.0 compatibility


(both of the above deserve special mentions along with lucene2seq and 
vector/matrix performance improvements).



________________________________
 From: Grant Ingersoll <gsing...@apache.org>
To: d...@mahout.apache.org; s...@apache.org 
Cc: user@mahout.apache.org 
Sent: Saturday, June 8, 2013 1:33 PM
Subject: Re: [DRAFT] 0.8 Release Announcement + Future Plans Discussion
 


On Jun 8, 2013, at 1:26 PM, Sebastian Schelter <s...@apache.org> wrote:

> Hi Grant,
> 
> Very good release announcement. I propose that we deprecate a lot more,
> I think we should be aggressive here to pave the way for a clean and
> slim 1.0 release.
> 
> I propose additionally deprecating the following algorithms since, to my
> knowledge, they are not actively used:
> 
> Collaborative Filtering:
> 
> - all recommenders in o.a.m.cf.taste.impl.recommender.knn
> 
> - the TreeClusteringRecommender in o.a.m.cf.taste.impl.recommender
> 
> - the SlopeOne implementations in o.a.m.cf.taste.hadoop.slopeone and
> o.a.m.cf.taste.impl.recommender.slopeone
> 
> - the distributed pseudo recommender in o.a.m.cf.taste.hadoop.pseudo

Pseudo is useful, no?  Don't know about the others.

> 
> Classification:
> 
> - the Hidden Markov Models in o.a.m.classifier.sequencelearning.hmm

We have some parallel training stuff coming, so I'd say -1 here, as I think 
HMMs are pretty important, no?

> 
> Clustering
> 
> - Fuzzy k-Means o.a.m.clustering.fuzzykmeans
> - Spectral k-Means in o.a.m.clustering.spectral

-1 on spectral being dropped as that seems to receive decent traction.

Not sure on Fuzzy, as I think it is a pretty trivial extension of K-Means.

> 
> Math
> 
> - the tooling in o.a.m.math.stats.entropy
> 
> Furthermore, I think we should deprecate the Lanczos implementation in
> o.a.m.math.hadoop.decomposer and port all code that uses it to SSVD.

No opinion.

+1 on everything else.

> 
> To all users and other committers: this is a biased first proposal.
> Please shout if you see things differently and want to have things kept.
> 
> Best,
> Sebastian
> 
> 
> On 08.06.2013 16:42, Grant Ingersoll wrote:
>> More tests are always welcome.
>> 
>> On Jun 8, 2013, at 10:29 AM, Ravi Mummulla <ravi.mummu...@gmail.com> wrote:
>> 
>>> Hi Grant,
>>> Regarding 1.0 plans, do we also want to include a note on adding tests
>>> where they don't exist or improving them where needed or is that implicit?
>>> 
>>> Thanks.
>>> 
>>> 
>>> On Sat, Jun 8, 2013 at 3:55 AM, Grant Ingersoll <gsing...@apache.org> wrote:
>>> 
>>>> Hi Mahouts,
>>>> 
>>>> A full copy of the proposed draft release notes is up at
>>>> https://cwiki.apache.org/confluence/display/MAHOUT/Release+0.8.  Please
>>>> add/edit as appropriate.
>>>> 
>>>> IN PARTICULAR, PLEASE PAY CLOSE ATTENTION TO THE SECTION LABELLED __FUTURE
>>>> PLANS__, which I have included below.  This is purely my own opinion, but I
>>>> think it reflects conversations I've had w/ both Robin and Sebastian at
>>>> Berlin Buzzwords.   I'm also interested in opinions on my proposed
>>>> deprecation plan (which I haven't discussed with anyone) which is put forth
>>>> in the 1.0 plans below.
>>>> 
>>>> --------------------------  DRAFT -------------------------
>>>> FUTURE PLANS
>>>> 
>>>> 0.9
>>>> 
>>>> As the project moves towards a 1.0 release, the community is working to
>>>> clean up and/or remove parts of the code base that are under-supported or
>>>> that underperform as well as to better focus the energy and contributions
>>>> on key algorithms that are proven to scale in production and have seen
>>>> wide-spread adoption.  To this end, in the next release, the project is
>>>> planning on removing support for the following algorithms unless there is
>>>> sustained support and improvement of them before the next release.
>>>> 
>>>> The algorithms to be removed are:
>>>> - From Clustering:
>>>>       Dirichlet
>>>>       MeanShift
>>>>       MinHash
>>>> - From Classification (both are sequential implementations)
>>>>       Winnow
>>>>       Perceptron
>>>> - Frequent Pattern Mining
>>>> - Collaborative Filtering
>>>>       GSI: DO ANY GO HERE?
>>>> - Other
>>>>       GSI: ANYTHING?
>>>> 
>>>> If you are interested in supporting 1 or more of these algorithms, please
>>>> make it known on d...@mahout.apache.org and via JIRA issues that fix
>>>> and/or improve them.  Please also provide supporting evidence as to their
>>>> effectiveness for you in production.
>>>> 
>>>> 1.0 PLANS
>>>> 
>>>> Our plans as a community are to focus 0.9 on cleanup of bugs and the
>>>> removal of the code mentioned above and then to follow with a 1.0 release
>>>> soon thereafter, at which point the community is committing to the support
>>>> of the algorithms packaged in 1.0 for at least two minor versions after
>>>> their release.  In the case of removal, we will deprecate the functionality
>>>> in the 1.(x+1) minor release and remove it in the 1.(x+2) release.  For
>>>> instance, if feature X is to be removed after the 1.2 release, it will be
>>>> deprecated in 1.3 and removed in 1.4.
>>>> 
>>>> ------------------- DRAFT ----------------------
>>>> 
>>>> -Grant
>>> 
>>> 
>>> 
>>> 
>>> -- 
>>> Thanks.
>> 
>> --------------------------------------------
>> Grant Ingersoll | @gsingers
>> http://www.lucidworks.com
>> 
>> 
>> 
>> 
>> 
>> 
> 

--------------------------------------------
Grant Ingersoll | @gsingers
http://www.lucidworks.com
