DistributedLDAModel missing APIs in org.apache.spark.ml

2016-03-19 Thread cindymc
I like using the new DataFrame APIs in Spark ML, compared to using RDDs in the older Spark MLlib. But it seems some of the older APIs are missing. In particular, '*.mllib.clustering.DistributedLDAModel' had two APIs that I need now: topDocumentsPerTopic and topTopicsPerDocument. How can I get at the
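
For illustration, here is a minimal sketch of what those two methods compute, written in plain Python rather than against any Spark API. It assumes the per-document topic weights have already been collected into (doc_id, weights) pairs, for example from the topic-distribution column that the spark.ml LDA model adds on transform. The function names and the input shape are hypothetical, and this is not how MLlib implements the methods.

```python
import heapq

# Hypothetical helpers: plain-Python stand-ins for the two MLlib methods.
# Input: a sequence of (doc_id, weights), where weights[t] is the weight
# of topic t in that document (an assumed, pre-collected layout).

def top_documents_per_topic(doc_topics, max_docs):
    """For each topic, return (doc_ids, weights) of its heaviest documents."""
    rows = list(doc_topics)
    if not rows:
        return []
    num_topics = len(rows[0][1])
    result = []
    for t in range(num_topics):
        # Keep the max_docs rows with the largest weight for topic t.
        best = heapq.nlargest(max_docs, rows, key=lambda r: r[1][t])
        result.append(([doc_id for doc_id, _ in best],
                       [weights[t] for _, weights in best]))
    return result

def top_topics_per_document(doc_topics, k):
    """For each document, return (doc_id, topic_indices, weights) of its top k topics."""
    out = []
    for doc_id, weights in doc_topics:
        best = heapq.nlargest(k, range(len(weights)), key=lambda t: weights[t])
        out.append((doc_id, best, [weights[t] for t in best]))
    return out
```

Both helpers work on data already held in driver memory, so they only suit corpora small enough to collect; a distributed version would need a different approach.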

Using netlib-java in Spark 1.6 on linux

2016-03-02 Thread cindymc
I want to take advantage of the Breeze linear algebra libraries, built on netlib-java and used heavily by Spark ML. I've found this amazingly time-consuming to figure out, and have only been able to do so on MacOS. I want to do the same on Linux:
$ uname -a
Linux slc10whv 3.8.13-68.3.4.el6uek.x86_64 #2