You can now save a random forest and use it to classify new data.

On Tue, Oct 19, 2010 at 3:40 PM, Sebastian Schelter <[email protected]> wrote:
> Here's the stuff I've been working on in 0.4:
>
>  * Map/Reduce job to compute the pairwise similarities of the rows of a
> matrix using a customizable similarity measure (with implementations already
> provided for co-occurrence, Euclidean distance, log-likelihood, Pearson
> correlation, Tanimoto coefficient, cosine)
>  * Map/Reduce job to compute the item-item similarities for item-based
> collaborative filtering
>  * RecommenderJob has evolved into a fully distributed item-based
> recommender
>
> -sebastian
>
> On 19.10.2010 16:30, Jeff Eastman wrote:
>>
>> On 10/19/10 7:00 AM, Sean Owen wrote:
>>>
>>> I've even lost track of what the big-ticket changes have been since 0.3.
>>> I'm compiling 7-8 bullet points for the release notes, as I am going
>>> through the release process now.
>>>
>>> Would anyone please volunteer some bullet points? I don't want to miss
>>> anything and want to describe it correctly. I'll do my best to fill in
>>> what seems missing.
>>>
>>>
>>
>> For clustering, here are a few:
>>
>>    * Model refactoring and CLI changes to improve integration and
>>      consistency
>>    * New ClusterEvaluator and CDbwClusterEvaluator offer new ways to
>>      evaluate clustering effectiveness
>>    * New Spectral Clustering and MinHash Clustering from GSoC (still
>>      experimental)
>>    * New VectorModelClassifier allows any set of clusters to be used
>>      for classification
>>
>
>
