What's your value for -km? Based on what you provided, -km should be 10000 * ln(2000000) ≈ 145087.
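In case it helps to try other cluster counts, here is a quick sketch of that arithmetic in Python. It only encodes the rule of thumb used in this message (-km = k * ln(n), with k the number of clusters and n the number of objects); the function name `estimate_km` is made up for illustration and is not part of Mahout.

```python
import math


def estimate_km(num_clusters: int, num_points: int) -> int:
    """Rule of thumb from this thread: km = k * ln(n), rounded to the nearest integer.

    Hypothetical helper for illustration; not a Mahout API.
    """
    return round(num_clusters * math.log(num_points))


if __name__ == "__main__":
    n = 2_000_000  # number of objects in the dataset
    for k in (10_000, 1_000):
        print(f"k={k}: -km ~ {estimate_km(k, n)}")
    # k=10000 gives 145087, k=1000 gives 14509
```

Since the estimate grows linearly with the number of clusters, cutting the clusters by a factor of ten cuts -km by the same factor.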
Try reducing your number of clusters to 1000 and setting -km = 14509.

On Tuesday, March 25, 2014 2:45 AM, fx MA XIAOJUN <xiaojun...@fujixerox.co.jp> wrote:

I am using Mahout StreamingKMeans in sequential mode.
With a dataset of 2000000 objects and 128 variables, I would like to get 10000 clusters.
A "GC overhead limit exceeded" error occurred.
How do I set the Java memory limit for sequential mode?

Yours Sincerely,
Ma