Does the job have access to breeze.linalg when it runs in cluster mode?
On Fri, Mar 25, 2016 at 8:34 AM -0700, "Trevor Grant" <trevor.d.gr...@gmail.com> wrote:

The following code works as expected in 'local' mode (Flink):

import breeze.linalg.{DenseVector => BreezeDenseVector}
import org.apache.flink.ml.common.LabeledVector
import org.apache.flink.ml.math.Vector
import org.apache.flink.ml.regression.MultipleLinearRegression

val features = 5
val N = 100

val synthDS = env.generateSequence(0, N).map(i =>
  LabeledVector(scala.util.Random.nextDouble,
    Vector.vectorConverter.convert(BreezeDenseVector.rand(features))))

val mlr = MultipleLinearRegression()
mlr.fit(synthDS)

val weights = mlr.weightsOption match {
  case Some(weights) => weights.collect()
  case None => throw new Exception("Could not calculate the weights.")
}

println(weights)

However when using Flink in stand-alone cluster mode, the job fails with numerous "Class Not Found ..." errors (related to the classes imported).

The code does work, so long as the weights aren't collected i.e. no error is thrown until the job is submitted to the cluster.

Thoughts?

tg

Trevor Grant
Data Scientist
https://github.com/rawkintrevo
http://stackexchange.com/users/3002022/rawkintrevo
http://trevorgrant.org

*"Fortunate is he, who is able to know the causes of things." -Virgil*