What is the feature dimension? I saw you used 100 partitions. How many cores does your cluster have?

-Xiangrui
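A rough sketch of why the feature dimension, partition count, and k matter together. It assumes that in each KMeans iteration every partition emits up to one dense partial-sum vector per cluster before the reduceByKey/collectAsMap step, so the data moved grows roughly as k × dimension × partitions. The function name and the dimension of 100,000 are hypothetical (the actual dimension is not given in the thread), and per-cluster counts and serialization overhead are ignored.

```python
def kmeans_shuffle_bytes(k, dim, partitions, bytes_per_value=8):
    """Rough upper bound on bytes of partial cluster sums per iteration.

    Assumes each partition emits one dense vector of `dim` doubles
    for each of the `k` clusters (an assumption, not a measured figure).
    """
    per_partition = k * dim * bytes_per_value
    return per_partition * partitions


# The k values and the 100 partitions come from the thread;
# dim=100_000 is a hypothetical placeholder for the unknown feature dimension.
for k in (10, 50, 100):
    mb = kmeans_shuffle_bytes(k, dim=100_000, partitions=100) / 1e6
    print(f"k={k}: roughly {mb:.0f} MB of partial sums per iteration")
```

Under this estimate the volume scales linearly with k, which would be consistent with k = 10 finishing while k = 50 and k = 100 stall, but only the real feature dimension can confirm that.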
On Tue, Oct 14, 2014 at 1:51 PM, Ray <ray-w...@outlook.com> wrote:
> Hi guys,
>
> An interesting observation: for the input dataset, which has 1.5 million
> vectors, if I set KMeans's k_value = 100 or k_value = 50, it hangs as
> mentioned above. However, if I decrease it to k_value = 10, the same error
> still appears in the log, but the application finishes successfully,
> without observable hanging.
>
> Hopefully this provides more information.
>
> Thanks.
>
> Ray
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-KMeans-hangs-at-reduceByKey-collectAsMap-tp16413p16417.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org