Hello. As part of `org.apache.spark.ml.feature.IDFModel`, I think it is a good idea to also expose:
1. Document frequency vector
2. Number of documents

We currently get both of these for free, so they only need to be exposed as public `val`s. This would save anyone who needs the document frequency of terms from re-implementing it. Currently, anyone who needs df has to reverse-compute it from the idf values. AFAIK, we don't explicitly provide such functionality in mllib. And we don't need a separate class if we can expose these in `IDFModel` itself.

Does that sound alright?

Regards,
Jatin
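
P.S. To illustrate the reverse computation mentioned above, here is a rough sketch. It assumes the smoothed formula idf = log((m + 1) / (df + 1)), where m is the number of documents, and it needs m to be known from somewhere else, since the model does not expose it. The object and method names are made up for this example and are not part of Spark. It also cannot recover df for terms whose idf was forced to 0 by a `minDocFreq` filter, if one was used.

```scala
// Hypothetical helper, not part of Spark: recovers document frequencies
// from idf values, assuming idf = log((m + 1) / (df + 1)).
object DfFromIdf {

  // Forward direction, included only so the round trip can be checked.
  def idfFromDf(df: Array[Long], numDocs: Long): Array[Double] =
    df.map(d => math.log((numDocs + 1.0) / (d + 1.0)))

  // Inverse: df = (m + 1) / exp(idf) - 1, rounded to the nearest integer
  // to absorb floating-point error.
  def dfFromIdf(idf: Array[Double], numDocs: Long): Array[Long] =
    idf.map(v => math.round((numDocs + 1.0) / math.exp(v) - 1.0))
}
```

With the two values exposed on `IDFModel`, none of this would be needed; the caller would just read them from the model.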