Can you show how to apply the IDF transform to tfWithId? Thanks.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/TF-IDF-in-Spark-1-1-0-tp16389p20877.html
-
Thanks for the response. Appreciate the help!
Burke
On Tue, Oct 14, 2014 at 3:00 PM, Xiangrui Meng wrote:
> You cannot recover the document from the TF-IDF vector, because
> HashingTF is not reversible. You can assign each document a unique ID,
> and join back the result after training. Hasing
You cannot recover the document from the TF-IDF vector, because
HashingTF is not reversible. You can assign each document a unique ID,
and join back the result after training. HashingTF can transform
an individual record:
val docs: RDD[(String, Seq[String])] = ...
val tf = new HashingTF()
val tfWithI
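
For reference, here is a minimal sketch of the approach described above: keep a unique ID with each document, hash the terms into term-frequency vectors, then fit IDF on the vectors and re-attach the IDs. The object name TfIdfWithIds and the method tfidfWithIds are hypothetical and not from the original message. The sketch assumes the RDD-based MLlib API (HashingTF, IDF) and that zipping keys with the transformed values is safe because both are element-wise maps over the same cached RDD.

```scala
import org.apache.spark.SparkContext._ // pair-RDD implicits, needed on Spark 1.x
import org.apache.spark.mllib.feature.{HashingTF, IDF}
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.rdd.RDD

// Hypothetical helper, for illustration only.
object TfIdfWithIds {
  // docs: (documentId, terms) pairs. Returns (documentId, TF-IDF vector) pairs.
  def tfidfWithIds(docs: RDD[(String, Seq[String])]): RDD[(String, Vector)] = {
    val tf = new HashingTF()

    // Hash each document's terms individually so the ID stays attached.
    // Cached because it is traversed more than once below.
    val tfWithId: RDD[(String, Vector)] =
      docs.mapValues(terms => tf.transform(terms)).cache()

    // Fit IDF on the vectors only, then rescale the same vectors.
    val idfModel = new IDF().fit(tfWithId.values)
    val tfidf: RDD[Vector] = idfModel.transform(tfWithId.values)

    // Assumption: keys and tfidf share partitioning and per-partition
    // element counts, since both are derived from tfWithId by maps.
    tfWithId.keys.zip(tfidf)
  }
}
```

The returned pairs can then be joined back to the original documents by ID, since the hashed vectors alone cannot be mapped back to the text.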
I'm following the MLlib example for TF-IDF and ran into a problem due to my
lack of knowledge of Scala and Spark. Any help would be greatly
appreciated.
Following the MLlib example, I could do something like this:
import org.apache.spark.rdd.RDD
import org.apache.spark.SparkContext
import org.apa