You can now save a random forest and use it to classify new data.

On Tue, Oct 19, 2010 at 3:40 PM, Sebastian Schelter wrote:
> Here's the stuff I've been working on in 0.4:
>
> * Map/Reduce job to compute the pairwise similarities of the rows of a
>   matrix using a customizable similarity measure (with implementations
>   already provided for cooccurrence, euclidean distance, loglikelihood,
>   pearson correlation, tanimoto-coefficient, cosine)
> * Map/Reduce job to compute the item-item-similarities for itembased
>   collaborative filtering
> * RecommenderJob has been evolved to a fully distributed itembased
>   recommender
>
> -sebastian
>
> On 19.10.2010 16:30, Jeff Eastman wrote:
>>
>> On 10/19/10 7:00 AM, Sean Owen wrote:
>>>
>>> I've even lost track of what the big-ticket changes have been since 0.3.
>>> I'm compiling 7-8 bullet points for the release notes, as I am going
>>> through the release process now.
>>>
>>> Would anyone please volunteer some bullet points? I don't want to miss
>>> anything and want to describe it correctly. I'll do my best to fill in
>>> what seems missing.
>>>
>>
>> For clustering, here's a few:
>>
>> * Model refactoring and CLI changes to improve integration and
>>   consistency
>> * New ClusterEvaluator and CDbwClusterEvaluator offer new ways to
>>   evaluate clustering effectiveness
>> * New Spectral Clustering and MinHash Clustering from GSoC (still
>>   experimental)
>> * New VectorModelClassifier allows any set of clusters to be used
>>   for classification
>>
