Re: [scikit-learn] Accessing Clustering Feature Tree in Birch

2017-10-02 Thread Roman Yurchak
Hello, sklearn.cluster.Birch follows the original BIRCH paper, which appears to be mostly focused on efficiently building the hierarchical clustering tree (and not so much on making the later analysis user-friendly). The attributes exposed by Birch are those that could be reasonably exposed gi

Re: [scikit-learn] TF-IDF

2017-10-02 Thread Roman Yurchak
Hi Apurva, if you consider the operations done by the augmented frequency and the cosine normalization independently from everything else, they are somewhat similar. The normalization by max is a p-norm with p→+∞. So apart from the 0.5 offset, both can be seen as document length normalizati
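
To illustrate the comparison in the snippet above, here is a minimal sketch in plain Python. The helper names (`p_norm`, `augmented_tf`, `l2_normalized_tf`) and the toy counts are hypothetical and not part of scikit-learn. It assumes the usual textbook form of augmented frequency, 0.5 + 0.5 * tf / max(tf), and shows numerically that the p-norm of the counts approaches their maximum as p grows, which is why dividing by the max can be read as a p-norm normalization with p→+∞.

```python
import math


def p_norm(values, p):
    """Return the p-norm of a sequence of term counts."""
    return sum(abs(v) ** p for v in values) ** (1.0 / p)


def augmented_tf(counts):
    """Augmented frequency (assumed textbook form): 0.5 + 0.5 * tf / max(tf).

    The offset is applied to every entry passed in, so only pass the counts
    of terms that actually occur in the document.
    """
    largest = max(counts)
    return [0.5 + 0.5 * c / largest for c in counts]


def l2_normalized_tf(counts):
    """Cosine (L2) normalization: divide each count by the Euclidean norm."""
    norm = math.sqrt(sum(c * c for c in counts))
    return [c / norm for c in counts]


# Hypothetical term counts for a single document.
counts = [3, 1, 4, 2]

# The p-norm shrinks toward max(counts) as p increases.
for p in (1, 2, 8, 64):
    print(p, p_norm(counts, p))

print(augmented_tf(counts))
print(l2_normalized_tf(counts))
```

In both cases each count is divided by a norm of the document's count vector; the two schemes differ in which norm is used (p→+∞ versus p = 2) and in the 0.5 offset.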