PowerIterationClustering Benchmark

2016-12-15 Thread Lydia Ickler
Hi all, I have a question regarding the PowerIterationClusteringExample. I have adjusted the code so that it reads a file via sc.textFile("path/to/input"), which works fine. Now I want to benchmark the algorithm using different numbers of nodes to see how well the implementation scales. As a
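
A minimal sketch of the setup described in this message, written against the PySpark MLlib API: read a similarity file with sc.textFile, run PowerIterationClustering, and time the run. The input format (one "src dst similarity" triple per line, whitespace-separated) is an assumption, as are the helper names parse_similarity and time_pic; the message does not say what the file looks like or how the benchmark was run.

```python
import time


def parse_similarity(line):
    """Parse one input line into (src_id, dst_id, similarity).

    Assumed format (not given in the message): "src dst similarity",
    separated by whitespace.
    """
    src, dst, sim = line.split()
    return int(src), int(dst), float(sim)


def time_pic(sc, path, k, max_iterations=10):
    """Hypothetical helper: run PIC on the file at `path` and time it.

    `sc` is an existing SparkContext. The import is done here so that
    the parsing helper above can be used without Spark installed.
    """
    from pyspark.mllib.clustering import PowerIterationClustering

    similarities = sc.textFile(path).map(parse_similarity)

    start = time.perf_counter()
    model = PowerIterationClustering.train(similarities, k, max_iterations)
    # count() forces the cluster assignments to be computed.
    num_assigned = model.assignments().count()
    elapsed = time.perf_counter() - start

    # textFile() is lazy, so reading and parsing the file happen inside
    # the timed region as well.
    return elapsed, num_assigned
```

To compare cluster sizes, one option is to call time_pic once per configuration with the same input and the same k, and to repeat each measurement a few times, since a single run can vary noticeably.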

distribute work (files)

2016-09-06 Thread Lydia Ickler
Hi, maybe this is a stupid question: I have a list of files. I want to use each file as input for an ML algorithm. All files are independent of one another. My question is how to distribute the work so that each worker takes a block of files and just runs the algorithm on them one by
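
One possible way to approach this, sketched under assumptions: parallelize the list of file paths (not the file contents) so that each partition holds a block of paths, and let every task run the algorithm on its paths one after another. The names split_into_blocks, run_block, and run_on_cluster are hypothetical, and `algorithm` stands for any function that takes a path and returns a result; it has to be able to open the file from whichever worker it runs on.

```python
def split_into_blocks(files, num_blocks):
    """Split a list of paths into num_blocks contiguous, near-equal blocks."""
    if num_blocks < 1:
        raise ValueError("num_blocks must be at least 1")
    base, extra = divmod(len(files), num_blocks)
    blocks = []
    start = 0
    for i in range(num_blocks):
        # The first `extra` blocks get one additional file.
        size = base + (1 if i < extra else 0)
        blocks.append(files[start:start + size])
        start += size
    return blocks


def run_block(block, algorithm):
    """Run `algorithm` on each file of one block, one by one."""
    return [(path, algorithm(path)) for path in block]


def run_on_cluster(sc, files, algorithm, num_slices):
    """Hypothetical Spark variant: `sc` is an existing SparkContext.

    parallelize() distributes the paths over num_slices partitions;
    each task then processes the paths in its partition sequentially.
    """
    return (sc.parallelize(files, num_slices)
              .map(lambda path: (path, algorithm(path)))
              .collect())
```

The first two helpers only illustrate the blocking idea locally. In the Spark variant the partitioning is done by parallelize, so split_into_blocks is not needed there.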

Eigenvalue solver

2016-01-12 Thread Lydia Ickler
Hi, I wanted to know whether there are any implementations yet, within the Machine Learning Library or in general, that can efficiently solve eigenvalue problems. If not, do you have suggestions on how to approach a parallel execution, maybe with BLAS or Breeze? Thanks in advance! Lydia Sent from my
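
As a point of reference for the question, here is a minimal single-machine sketch of power iteration, which approximates only the dominant eigenvalue and its eigenvector. It is not a general eigenvalue solver and does not use BLAS or Breeze. It assumes the matrix has a unique eigenvalue of largest magnitude and that the start vector is not orthogonal to the corresponding eigenvector. The function names are hypothetical.

```python
import math


def mat_vec(a, v):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


def power_iteration(a, num_iters=1000, tol=1e-12):
    """Approximate the dominant eigenpair of the square matrix `a`.

    Returns (eigenvalue, eigenvector) with a unit-length eigenvector.
    """
    n = len(a)
    v = [1.0 / math.sqrt(n)] * n  # normalized start vector
    eigenvalue = 0.0
    for _ in range(num_iters):
        w = mat_vec(a, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("iteration reached the zero vector")
        w = [x / norm for x in w]
        # Rayleigh quotient w^T A w as the current eigenvalue estimate.
        new_eigenvalue = sum(x * y for x, y in zip(w, mat_vec(a, w)))
        if abs(new_eigenvalue - eigenvalue) < tol:
            return new_eigenvalue, w
        eigenvalue = new_eigenvalue
        v = w
    return eigenvalue, v
```

The only operation that touches the whole matrix is the matrix-vector product, so a parallel version would mainly need to replace mat_vec with a distributed product and keep the small vector operations on the driver.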