Hello, everyone.
In fact, I installed Mahout 0.9 in pseudo-distributed mode on CDH 5.0.0-beta-2.
Note that Mahout 0.9 may not be able to function on YARN,
so I chose MRv1 when installing CDH 5.
Installing Mahout 0.9 was easy:
all I did was unzip the Mahout 0.9 distribution and add its path to /etc/profile.
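For anyone curious, the whole setup was roughly the following (a sketch only; the archive name and install directory are assumptions, adjust them to your environment):

```shell
# Unpack the Mahout 0.9 binary distribution (archive name and target dir assumed):
tar xzf mahout-distribution-0.9.tar.gz -C /usr/local/

# Then append to /etc/profile (or ~/.bashrc):
export MAHOUT_HOME=/usr/local/mahout-distribution-0.9
export PATH=$PATH:$MAHOUT_HOME/bin
```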
From: Suneel Marthi [mailto:suneel_mar...@yahoo.com]
Sent: Wednesday, March 19, 2014 9:08 AM
To: fx MA XIAOJUN; user@mahout.apache.org
Subject: Re: reduce is too slow in StreamingKmeans
When dealing with Streaming KMeans, it would be helpful for troubleshooting
purposes if you could provide the value for k.
What's your value for -km?
>Based on what you had provided, -km should be = 10000 * ln(2,000,000) ≈ 145090.
>
>Try reducing your no. of clusters to 1000 and -km = 14509.
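As an aside, the rule of thumb here is -km ≈ k * ln(n). Assuming n = 2,000,000 points, it can be sanity-checked from the shell (awk's log() is the natural log):

```shell
# -km heuristic: estimated map clusters = k * ln(n), rounded
awk 'BEGIN { n = 2000000; printf "%.0f\n", 10000 * log(n) }'  # prints 145087
awk 'BEGIN { n = 2000000; printf "%.0f\n", 1000 * log(n) }'   # prints 14509
```

With the reduced values, the invocation would then look something like `mahout streamingkmeans -i <input> -o <output> -k 1000 -km 14509 ...` (check `mahout streamingkmeans --help` for the exact flag names in your version).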
>On Tuesday, March 25, 2014 2:45 AM, fx MA XIAOJUN wrote:
I am using Mahout StreamingKMeans in sequential mode.
With a dataset of 2,000,000 objects and 128 variables, I would like to get
10,000 clusters.
A "GC overhead limit exceeded" error occurred.
How do I set the Java memory limit for sequential mode?
Yours Sincerely,
Ma
As Mahout StreamingKMeans's MapReduce mode has problems (the reduce gets stuck),
I would like to try sequential mode.
However, a "java.lang.OutOfMemoryError" occurs.
Where do I set the JVM heap size for sequential mode?
Is it the same as for MapReduce mode?
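For what it's worth, the bin/mahout launcher script honors a couple of environment variables for exactly this; as far as I can tell, MAHOUT_LOCAL forces local (sequential) execution and MAHOUT_HEAPSIZE sets the maximum JVM heap in MB (the 4096 below is just an illustrative value):

```shell
# Any non-empty MAHOUT_LOCAL makes bin/mahout run locally,
# ignoring HADOOP_HOME / HADOOP_CONF_DIR:
export MAHOUT_LOCAL=true

# Max heap in MB for the launched JVM (turned into -Xmx<n>m by the script):
export MAHOUT_HEAPSIZE=4096
```

In MapReduce mode this variable does not reach the task JVMs; there the heap comes from Hadoop's mapred.child.java.opts (e.g. -Xmx settings in mapred-site.xml), if I remember the MRv1 knobs correctly.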
-----Original Message-----
From: fx
"mahout kmeans" can run properly.
Is mahout 0.9 compatible with Hadoop 0.20?
-----Original Message-----
From: Suneel Marthi [mailto:suneel_mar...@yahoo.com]
Sent: Monday, March 17, 2014 6:21 PM
To: fx MA XIAOJUN; user@mahout.apache.org
Subject: Re: reduce is too slow in StreamingKmeans
Thank you for your quick reply.
As to -km, I thought it was log10 instead of ln. I was wrong...
This time I set -km 14509 and ran mahout streamingkmeans again (CDH 5.0 MRv1,
Mahout 0.8).
The maps ran faster than before, but the reduce was still stuck at 76% forever.
So, I uninstalled Mahout 0.8.