When computing term frequency, we can use either the HashingTF or the
CountVectorizer feature extractor.
However, both of them just use the number of times that a term appears in a
document.
That is a raw count, not a true frequency. Actually, it should be divided by
the length of the document.
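
To make the difference concrete, here is a minimal sketch in plain Python. It
does not use the Spark API; the function names and the sample document are
made up for illustration, and "length of the document" is assumed to mean the
number of tokens.

```python
from collections import Counter


def raw_term_counts(tokens):
    """Number of times each term appears in the document (a raw count)."""
    return Counter(tokens)


def normalized_term_frequency(tokens):
    """Raw counts divided by the document length (number of tokens)."""
    total = len(tokens)
    if total == 0:
        # An empty document has no terms, so there is nothing to normalize.
        return {}
    return {term: count / total for term, count in Counter(tokens).items()}


# Hypothetical four-token document.
doc = ["spark", "hadoop", "spark", "flink"]

print(raw_term_counts(doc)["spark"])            # raw count of "spark"
print(normalized_term_frequency(doc)["spark"])  # count divided by len(doc)
```

With raw counts, a long document gets larger values than a short one for the
same relative usage of a term; dividing by the length removes that effect.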

Is this a wanted feature?

-- 
Hao Ren

Data Engineer @ leboncoin

Paris, France
