Re: Spark KMeans hangs at reduceByKey / collectAsMap

2014-10-15 Thread Ray
there for almost 1 hour. I guess I can only go with random initialization in KMeans. Thanks again for your help. Ray -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-KMeans-hangs-at-reduceByKey-collectAsMap-tp16413p16530.html Sent from the Apache Spark User List

Spark KMeans hangs at reduceByKey / collectAsMap

2014-10-14 Thread Ray
: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-KMeans-hangs-at-reduceByKey-collectAsMap-tp16413.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr

Re: Spark KMeans hangs at reduceByKey / collectAsMap

2014-10-14 Thread Ray
observable hanging. Hopefully this provides more information. Thanks. Ray -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-KMeans-hangs-at-reduceByKey-collectAsMap-tp16413p16417.html Sent from the Apache Spark User List mailing list archive at Nabble.com

Re: Spark KMeans hangs at reduceByKey / collectAsMap

2014-10-14 Thread Xiangrui Meng
: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-KMeans-hangs-at-reduceByKey-collectAsMap-tp16413p16417.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr

Re: Spark KMeans hangs at reduceByKey / collectAsMap

2014-10-14 Thread Ray
.1001560.n3.nabble.com/Spark-KMeans-hangs-at-reduceByKey-collectAsMap-tp16413p16428.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org

Re: Spark KMeans hangs at reduceByKey / collectAsMap

2014-10-14 Thread Burak Yavuz
:58:03 PM Subject: Re: Spark KMeans hangs at reduceByKey / collectAsMap Hi Xiangrui, The input dataset has 1.5 million sparse vectors. Each sparse vector has a dimension(cardinality) of 9153 and has less than 15 nonzero elements. Yes, if I set num-executors = 200, from the hadoop cluster

Re: Spark KMeans hangs at reduceByKey / collectAsMap

2014-10-14 Thread Ray
be an active stage with an incomplete progress bar in the UI. Am I wrong? Thanks, Burak! Ray -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-KMeans-hangs-at-reduceByKey-collectAsMap-tp16413p16438.html Sent from the Apache Spark User List mailing

Re: Spark KMeans hangs at reduceByKey / collectAsMap

2014-10-14 Thread DB Tsai
- From: Ray ray-w...@outlook.com To: u...@spark.incubator.apache.org Sent: Tuesday, October 14, 2014 2:58:03 PM Subject: Re: Spark KMeans hangs at reduceByKey / collectAsMap Hi Xiangrui, The input dataset has 1.5 million sparse vectors. Each sparse vector has a dimension(cardinality) of 9153

Re: Spark KMeans hangs at reduceByKey / collectAsMap

2014-10-14 Thread Xiangrui Meng
, October 14, 2014 2:58:03 PM Subject: Re: Spark KMeans hangs at reduceByKey / collectAsMap Hi Xiangrui, The input dataset has 1.5 million sparse vectors. Each sparse vector has a dimension(cardinality) of 9153 and has less than 15 nonzero elements. Yes, if I set num-executors = 200, from

Re: Spark KMeans hangs at reduceByKey / collectAsMap

2014-10-14 Thread Ray
, it just finished quickly~~ In your test on mnis8m, did you use KMeans++ as initialization mode? How long it takes? Thanks again for your help. Ray -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-KMeans-hangs-at-reduceByKey-collectAsMap-tp16413p16450

Re: Spark KMeans hangs at reduceByKey / collectAsMap

2014-10-14 Thread Xiangrui Meng
. This time, it just finished quickly~~ In your test on mnis8m, did you use KMeans++ as initialization mode? How long it takes? Thanks again for your help. Ray -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-KMeans-hangs-at-reduceByKey