Re: k-means hang without error/warning
Hi Sean,

My system is 64-bit Windows. I looked in Resource Monitor: Java is the only process using CPU, at about 13%; there is no disk activity related to Java; and only about 6 GB of the 56 GB of memory is in use. The system responds well, so I don't think it is a system issue.

Thanks,
David

On Mon, 16 Mar 2015 22:30, Sean Owen <so...@cloudera.com> wrote:
> I think you'd have to say more about "stopped working". Is the GC thrashing? Does the UI respond? Is the CPU busy or not?
>
> On Mon, Mar 16, 2015 at 4:25 AM, Xi Shen <davidshe...@gmail.com> wrote:
>> Hi,
>>
>> I am running k-means on Spark in local mode. My data set is about 30k records, and I set k = 1000. The algorithm started and finished 13 jobs according to the UI monitor, then it stopped working. The last log line I saw was:
>>
>> [Spark Context Cleaner] INFO org.apache.spark.ContextCleaner - Cleaned broadcast 16
>>
>> There are many similar log lines, but it always seems to stop at the 16th. If I lower the k value, the algorithm terminates. So I just want to know what is wrong with k = 1000.
>>
>> Thanks,
>> David
Re: k-means hang without error/warning
I think you'd have to say more about "stopped working". Is the GC thrashing? Does the UI respond? Is the CPU busy or not?

On Mon, Mar 16, 2015 at 4:25 AM, Xi Shen <davidshe...@gmail.com> wrote:
> Hi,
>
> I am running k-means on Spark in local mode. My data set is about 30k records, and I set k = 1000. The algorithm started and finished 13 jobs according to the UI monitor, then it stopped working. The last log line I saw was:
>
> [Spark Context Cleaner] INFO org.apache.spark.ContextCleaner - Cleaned broadcast 16
>
> There are many similar log lines, but it always seems to stop at the 16th. If I lower the k value, the algorithm terminates. So I just want to know what is wrong with k = 1000.
>
> Thanks,
> David
Re: k-means hang without error/warning
How many threads are you allocating when creating the SparkContext? For example, local[4] allocates 4 threads. You can try increasing it to a higher number, and also try setting the level of parallelism to a higher number.

Thanks
Best Regards

On Mon, Mar 16, 2015 at 9:55 AM, Xi Shen <davidshe...@gmail.com> wrote:
> Hi,
>
> I am running k-means on Spark in local mode. My data set is about 30k records, and I set k = 1000. The algorithm started and finished 13 jobs according to the UI monitor, then it stopped working. The last log line I saw was:
>
> [Spark Context Cleaner] INFO org.apache.spark.ContextCleaner - Cleaned broadcast 16
>
> There are many similar log lines, but it always seems to stop at the 16th. If I lower the k value, the algorithm terminates. So I just want to know what is wrong with k = 1000.
>
> Thanks,
> David
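[Editor's note: a minimal sketch of the two settings suggested above, assuming Spark's Scala API circa 1.x; the app name and specific numbers are illustrative, not from the thread.]

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Run locally with 8 worker threads instead of local[4], and raise the
// default parallelism so each stage is split into more tasks.
val conf = new SparkConf()
  .setAppName("kmeans-local")               // illustrative name
  .setMaster("local[8]")                    // local[*] uses all available cores
  .set("spark.default.parallelism", "16")   // more partitions per shuffle stage
val sc = new SparkContext(conf)
```

The master URL controls how many tasks can run concurrently in local mode, while spark.default.parallelism controls how many partitions operations like reduceByKey produce by default, so the two are usually tuned together.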
Re: k-means hang without error/warning
I used local[*]. The CPU hits about 80% while there are active jobs, then it drops to about 13% and hangs for a very long time.

Thanks,
David

On Mon, 16 Mar 2015 17:46, Akhil Das <ak...@sigmoidanalytics.com> wrote:
> How many threads are you allocating when creating the SparkContext? For example, local[4] allocates 4 threads. You can try increasing it to a higher number, and also try setting the level of parallelism to a higher number.
>
> Thanks
> Best Regards
>
> On Mon, Mar 16, 2015 at 9:55 AM, Xi Shen <davidshe...@gmail.com> wrote:
>> Hi,
>>
>> I am running k-means on Spark in local mode. My data set is about 30k records, and I set k = 1000. The algorithm started and finished 13 jobs according to the UI monitor, then it stopped working. The last log line I saw was:
>>
>> [Spark Context Cleaner] INFO org.apache.spark.ContextCleaner - Cleaned broadcast 16
>>
>> There are many similar log lines, but it always seems to stop at the 16th. If I lower the k value, the algorithm terminates. So I just want to know what is wrong with k = 1000.
>>
>> Thanks,
>> David
k-means hang without error/warning
Hi,

I am running k-means on Spark in local mode. My data set is about 30k records, and I set k = 1000. The algorithm started and finished 13 jobs according to the UI monitor, then it stopped working. The last log line I saw was:

[Spark Context Cleaner] INFO org.apache.spark.ContextCleaner - Cleaned broadcast 16

There are many similar log lines, but it always seems to stop at the 16th. If I lower the k value, the algorithm terminates. So I just want to know what is wrong with k = 1000.

Thanks,
David
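[Editor's note: for reference, a minimal MLlib run matching the setup described above might look like the sketch below. The input path, file format, and iteration count are made up; `sc` is assumed to be an existing SparkContext. Note that with k = 1000, each iteration computes roughly 30k x 1000 point-to-center distances and broadcasts all 1000 centers, which is why a large k is far more expensive per iteration than a small one.]

```scala
import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.Vectors

// Parse one space-separated vector per line (path and format are illustrative).
val data = sc.textFile("data/points.txt")
  .map(line => Vectors.dense(line.split(' ').map(_.toDouble)))
  .cache()  // k-means is iterative; caching avoids re-reading the input each pass

// Train with k = 1000 and a capped number of iterations.
val model = KMeans.train(data, 1000, 20)
println(s"Within-set sum of squared errors: ${model.computeCost(data)}")
```

Capping maxIterations (the third argument) bounds the run even if the centers are slow to converge, which can help distinguish "converging slowly" from "actually hung".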