I like using the new DataFrame APIs in Spark ML, compared to using RDDs in
the older Spark MLlib. But it seems some of the older APIs are missing. In
particular, '*.mllib.clustering.DistributedLDAModel' had two methods that I
need now:
topDocumentsPerTopic
topTopicsPerDocument
How can I get at the
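
As a rough illustration of what these two methods compute, here is a minimal plain-Python sketch. It is not the Spark API. It assumes the per-document topic distributions (for example, the vectors that Spark ML's LDAModel.transform writes to its "topicDistribution" column by default) have already been collected into an ordinary dict keyed by document id. The function names and the sample data are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple


def top_topics_per_document(
    doc_topics: Dict[int, List[float]], k: int
) -> Dict[int, List[Tuple[int, float]]]:
    """For each document id, return its k highest-weighted (topic, weight) pairs."""
    return {
        doc_id: heapq.nlargest(k, enumerate(weights), key=lambda tw: tw[1])
        for doc_id, weights in doc_topics.items()
    }


def top_documents_per_topic(
    doc_topics: Dict[int, List[float]], max_docs: int
) -> Dict[int, List[Tuple[int, float]]]:
    """For each topic index, return the max_docs highest-weighted (doc_id, weight) pairs."""
    # Assumes every document has a weight vector of the same length.
    num_topics = len(next(iter(doc_topics.values()))) if doc_topics else 0
    return {
        topic: heapq.nlargest(
            max_docs,
            ((doc_id, weights[topic]) for doc_id, weights in doc_topics.items()),
            key=lambda dw: dw[1],
        )
        for topic in range(num_topics)
    }


# Hypothetical topic distributions: 3 documents, 3 topics.
sample = {
    0: [0.7, 0.2, 0.1],
    1: [0.1, 0.3, 0.6],
    2: [0.35, 0.5, 0.15],
}

topics_by_doc = top_topics_per_document(sample, k=2)
docs_by_topic = top_documents_per_topic(sample, max_docs=2)
```

This only works when the distributions fit in driver memory. For a large corpus the same ranking would have to be expressed as DataFrame operations instead.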
I want to take advantage of the Breeze linear algebra libraries, built on
netlib-java, which Spark ML uses heavily. I've found this amazingly
time-consuming to figure out, and have only been able to do so on macOS. I
want to do the same on Linux:
$ uname -a
Linux slc10whv 3.8.13-68.3.4.el6uek.x86_64 #2