Hi Jaonary,

With the current implementation, you need to call Array.slice to split
each flattened array into per-point Array[Double] rows and cache the
resulting RDD. There is a plan to support block-wise input data, and I
will keep you informed.
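One way to sketch this slicing, assuming each flattened row packs consecutive points of a known dimension `dim` (the names `dim`, `flatRow`, and `flat` below are hypothetical; `grouped(dim)` is equivalent to repeated `Array.slice` calls):

```scala
// Hypothetical example: dim = 3, one flattened row-major row
// holding two 3-dimensional points.
val dim = 3
val flatRow: Array[Double] = Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0)

// Split the flat array into chunks of length `dim`,
// one chunk per data point.
val points: Array[Array[Double]] = flatRow.grouped(dim).toArray

// On the RDD (assuming flat: RDD[Array[Double]]), the same idea:
//   val pointsRdd = flat.flatMap(_.grouped(dim))
//   pointsRdd.cache()
// and then pass pointsRdd to KMeans.
```

Using `flatMap` keeps the result as one RDD of per-point arrays rather than an RDD of nested arrays, which matches what KMeans expects as input.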

Best,
Xiangrui

On Tue, Mar 18, 2014 at 2:46 AM, Jaonary Rabarisoa <jaon...@gmail.com> wrote:
> Dear All,
>
> I'm trying to cluster data from native library code with Spark KMeans||. In
> my native library the data are represented as a matrix (rows = number of
> data points, cols = dimension). For efficiency reasons, they are copied into
> a one-dimensional Scala array in row-major order, so after the computation I
> have an RDD[Array[Double]], but each array represents a set of data points
> instead of a single point. I need to transform these arrays into
> Array[Array[Double]] before running the KMeans|| algorithm.
>
> How can I do this efficiently?
>
>
>
> Best regards,
