Hi,

How are you running K-Means? What is your k? What is the dimension of your
dataset (columns)? Which Spark version are you using?
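
One thing worth checking in the meantime: computeCost makes a full pass over
the dataset, so if the input RDD isn't cached, the entire read/parse lineage
is re-executed, and that often dominates the runtime. A minimal sketch of
what I mean (assuming a spark-shell `sc` and a hypothetical comma-separated
input file):

  import org.apache.spark.mllib.clustering.KMeans
  import org.apache.spark.mllib.linalg.Vectors

  // Hypothetical input path; parse each line into a dense vector.
  val data = sc.textFile("data.csv")
    .map(line => Vectors.dense(line.split(',').map(_.toDouble)))
    .cache() // keep parsed vectors in memory so computeCost doesn't
             // re-read and re-parse the file on its pass

  val model = KMeans.train(data, 10, 20) // k = 10, maxIterations = 20
  val cost = model.computeCost(data)     // sum of squared distances to centers

If the data is already cached and computeCost is still slow, the answers to
the questions above would help narrow it down.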

Thanks,
Burak

On Mon, Jul 13, 2015 at 2:53 AM, Nirmal Fernando <nir...@wso2.com> wrote:

> Hi,
>
> For a fairly large dataset (30MB), KMeansModel.computeCost takes a lot
> of time (16+ minutes).
>
> Most of that time is spent in this task:
>
> org.apache.spark.rdd.DoubleRDDFunctions.sum(DoubleRDDFunctions.scala:33)
> org.apache.spark.mllib.clustering.KMeansModel.computeCost(KMeansModel.scala:70)
>
> Can this be improved?
>
> --
>
> Thanks & regards,
> Nirmal
>
> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
> Mobile: +94715779733
> Blog: http://nirmalfdo.blogspot.com/
>
>
>