Re: debug mode

2014-03-26 Thread Mahmood Naderan
Excuse me, I forgot to ask: should I use mvnDebug for debugging purposes?   Regards, Mahmood On , Mahmood Naderan wrote: Let me put it this way. Using GNU Make, we use "-g -ggdb" to insert debug symbols in the object file. On the other hand, if we use "-O3" it will optimize the code

Re: debug mode

2014-03-26 Thread Mahmood Naderan
Let me put it this way. Using GNU Make, we use "-g -ggdb" to insert debug symbols in the object file. On the other hand, if we use "-O3" it will optimize the code and remove debug symbols. As a result, further debugging is not possible. Now I want to know, if I run "mvn install" will it opti

Re: GC Overhead limit exceed in sequential mode of Mahout Streamingkmeans

2014-03-26 Thread Suneel Marthi
Hi Ma, Are you really looking to create 10,000 clusters? Could you first try with 1000 clusters, so your -km would then be =~ 14510 for k = 1000? On Wednesday, March 26, 2014 10:33 PM, fx MA XIAOJUN wrote: Dear Suneel, Thank you for your reply. Dear Roland, Thank you for your participation in

RE: GC Overhead limit exceed in sequential mode of Mahout Streamingkmeans

2014-03-26 Thread fx MA XIAOJUN
Dear Suneel, Thank you for your reply. Dear Roland, Thank you for your participation in discussing this problem. My configuration is as follows. -km is set as 14.(1*ln200) mapred.child.java.opts=-Xmx4g Sequential mode does not start a MapReduce job. So I don’t know if mapred.

Re: GC Overhead limit exceed in sequential mode of Mahout Streamingkmeans

2014-03-26 Thread Suneel Marthi
... forgot to ask: how many dimensions are you trying to cluster on? Adding a combiner may address this excessive memory usage issue in the reducer (presently not there). On Wednesday, March 26, 2014 8:10 PM, Suneel Marthi wrote: Hi Roland, Could you tell me how many intermediate centroids

Re: GC Overhead limit exceed in sequential mode of Mahout Streamingkmeans

2014-03-26 Thread Suneel Marthi
Hi Roland, Could you tell me how many intermediate centroids were being emitted from the mappers to the single reducer in your scenario? You have 6GB allocated for a reducer, which is way more than what I can get on my work cluster (only 2GB -:)). I take it that you have not specified the -rskm

Re: GC Overhead limit exceed in sequential mode of Mahout Streamingkmeans

2014-03-26 Thread Roland von Herget
Hi Suneel, I have the exact same problem with the following values: No of docs: 25.904.599 command line params: -k 1000 -km 17070 Reducer Xmx is 6GB, running in full Map/Reduce mode. Do you have any other idea what to try? Thanks, Roland On Tue, Mar 25, 2014 at 7:13 PM, Suneel Marthi wrote: >
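
The -km values quoted in this thread look consistent with the rule of thumb km = k * ln(n), where n is the number of input points. That formula is not spelled out in the snippets above, so treat it as an assumption. A minimal Python sketch under that assumption (the function name `estimated_map_clusters` is hypothetical and is not part of Mahout):

```python
import math


def estimated_map_clusters(k: int, n_points: int) -> int:
    """Rule-of-thumb estimate for -km, ASSUMING km = k * ln(n_points).

    k        -- final number of clusters requested (-k)
    n_points -- number of input vectors/documents
    """
    if k <= 0:
        raise ValueError("k must be positive")
    if n_points <= 1:
        raise ValueError("n_points must be greater than 1")
    # math.log is the natural logarithm.
    return round(k * math.log(n_points))


if __name__ == "__main__":
    # Values taken from Roland's message: 25,904,599 docs, -k 1000.
    print(estimated_map_clusters(1000, 25_904_599))
```

For Roland's inputs this gives a value of about 17070, which matches the -km he reports. Because the estimate scales linearly with k, asking for 10,000 clusters instead of 1000 multiplies the number of intermediate centroids by ten, which would explain why trying a smaller k first was suggested.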