When computing term frequency, we can use either the HashingTF or the CountVectorizer feature extractor. However, both of them only use the raw number of times that a term appears in a document, which is not a true frequency. Actually, the count should be divided by the length of the document.
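To make the difference concrete, here is a minimal sketch in plain Python. It does not use the Spark API, and the function names `raw_term_counts` and `normalized_term_frequency` are hypothetical, chosen only for illustration. It assumes a document is already tokenized into a list of terms and that "length of the document" means the total number of tokens.

```python
from collections import Counter


def raw_term_counts(tokens):
    """Return how many times each term occurs (a raw count, not a frequency)."""
    return Counter(tokens)


def normalized_term_frequency(tokens):
    """Return each term's count divided by the total number of tokens."""
    total = len(tokens)
    if total == 0:
        # An empty document has no terms, so there is nothing to normalize.
        return {}
    counts = Counter(tokens)
    return {term: count / total for term, count in counts.items()}


if __name__ == "__main__":
    doc = ["spark", "ml", "spark", "tf"]
    print(raw_term_counts(doc))            # raw counts per term
    print(normalized_term_frequency(doc))  # counts divided by document length
```

With the normalized version, the values for a non-empty document sum to 1, so documents of different lengths become comparable.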
Is this a wanted feature?

--
Hao Ren
Data Engineer @ leboncoin
Paris, France