Here's the stuff I've been working on in 0.4:
* Map/Reduce job to compute the pairwise similarities of the rows of a
matrix using a customizable similarity measure (with implementations
already provided for co-occurrence, Euclidean distance, log-likelihood,
Pearson correlation, Tanimoto coefficient, cosine)
* Map/Reduce job to compute the item-item similarities for item-based
collaborative filtering
* RecommenderJob has evolved into a fully distributed item-based
recommender
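
To illustrate the idea behind the first bullet, here is a minimal in-memory
sketch of pairwise row similarity with a pluggable measure. It is not the
Map/Reduce job itself and does not use any Mahout API; the function names
(cosine, tanimoto, pairwise_row_similarities) are made up for this example,
and it assumes dense rows of equal length.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity of two equal-length dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for all-zero rows
    return dot / (norm_a * norm_b)


def tanimoto(a, b):
    """Tanimoto coefficient, treating non-zero entries as set membership."""
    in_a = {i for i, x in enumerate(a) if x != 0}
    in_b = {i for i, y in enumerate(b) if y != 0}
    union = in_a | in_b
    return len(in_a & in_b) / len(union) if union else 0.0


def pairwise_row_similarities(rows, measure):
    """Apply `measure` to every unordered pair of rows.

    Returns a dict keyed by (i, j) with i < j. Any function taking two
    rows and returning a number can be passed as `measure`.
    """
    return {
        (i, j): measure(rows[i], rows[j])
        for i, j in combinations(range(len(rows)), 2)
    }


if __name__ == "__main__":
    matrix = [
        [1, 0, 1],
        [1, 0, 1],
        [0, 1, 0],
    ]
    for name, measure in (("cosine", cosine), ("tanimoto", tanimoto)):
        print(name, pairwise_row_similarities(matrix, measure))
```

Swapping the measure is just a matter of passing a different function. The
sketch compares all pairs on one machine, which a distributed job would
need to avoid for large matrices.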
-sebastian
On 19.10.2010 16:30, Jeff Eastman wrote:
On 10/19/10 7:00 AM, Sean Owen wrote:
I've even lost track of what the big-ticket changes have been since 0.3. I'm
compiling 7-8 bullet points for the release notes, as I am going through the
release process now.
Would anyone please volunteer some bullet points? I don't want to miss
anything and want to describe it correctly. I'll do my best to fill in what
seems missing.
For clustering, here are a few:
* Model refactoring and CLI changes to improve integration and
consistency
* New ClusterEvaluator and CDbwClusterEvaluator offer new ways to
evaluate clustering effectiveness
* New Spectral Clustering and MinHash Clustering from GSoC (still
experimental)
* New VectorModelClassifier allows any set of clusters to be used
for classification