For those still interested, I raised this issue on JIRA and received an
official response:
https://issues.apache.org/jira/browse/SPARK-6340
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/order-preservation-with-RDDs-tp22052p22088.html
for text classification,
where (correct me if I'm wrong) there is no built-in mechanism to keep track
of document-ids through the HashingTF and IDF fitting and transformations.
Thanks.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/order-preservation-with-RDDs-tp22052.html
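
On the document-id question, one possible approach is to carry each id as a key next to its vector at every step, so that nothing depends on element order. The sketch below is plain Python, not Spark code: `hashing_tf`, `fit_idf`, `tfidf_by_id`, the 16-bucket feature space, and the smoothed IDF formula are all illustrative assumptions and are not claimed to match MLlib's implementation.

```python
import math
import zlib
from collections import Counter

NUM_FEATURES = 16  # hypothetical, deliberately tiny feature space


def hashing_tf(tokens, num_features=NUM_FEATURES):
    """Map a token list to a sparse {bucket: count} term-frequency vector."""
    # crc32 is used because it is stable across processes, unlike hash() on str.
    buckets = (zlib.crc32(t.encode("utf-8")) % num_features for t in tokens)
    return dict(Counter(buckets))


def fit_idf(tf_vectors):
    """Return a smoothed IDF weight per bucket: log((n + 1) / (df + 1))."""
    n = len(tf_vectors)
    doc_freq = Counter(bucket for vec in tf_vectors for bucket in vec)
    return {b: math.log((n + 1) / (df + 1)) for b, df in doc_freq.items()}


def tfidf_by_id(docs):
    """Take [(doc_id, tokens)] and return [(doc_id, tfidf_vector)]."""
    # The id travels with its vector from the first step onward.
    keyed_tf = [(doc_id, hashing_tf(tokens)) for doc_id, tokens in docs]
    # IDF is fitted on the vectors only; the ids are not needed here.
    idf = fit_idf([vec for _, vec in keyed_tf])
    # Re-weight each vector while keeping its key attached.
    return [
        (doc_id, {b: count * idf[b] for b, count in vec.items()})
        for doc_id, vec in keyed_tf
    ]


docs = [
    ("doc-a", ["spark", "rdd", "order"]),
    ("doc-b", ["spark", "idf"]),
    ("doc-c", ["hashing", "tf", "idf"]),
]
result = tfidf_by_id(docs)
for doc_id, vec in result:
    print(doc_id, vec)
```

Because every intermediate value is an (id, vector) pair, the ids can be recovered after the transformations without assuming anything about ordering.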