Thank you for sharing.

Much appreciated.

Tim

> On Feb 22, 2015, at 1:23 AM, Jonathan Aquilina <jaquil...@eagleeyet.net> 
> wrote:
> 
> Hi Tim,
> 
> Not sure whether this will be of any use for improving overall cluster 
> performance for you, but I hope it gives you and others some ideas.
> 
> https://media.amazonwebservices.com/AWS_Amazon_EMR_Best_Practices.pdf
> 
>  
> ---
> Regards,
> Jonathan Aquilina
> Founder Eagle Eye T
> On 2015-02-22 07:57, Tim Chou wrote:
> 
>> Hi Jonathan,
>>  
>> Very useful information. I will look into Ganglia.
>>  
>> However, I do not have administrative privileges on the cluster, so I am not 
>> sure I can install Ganglia on it.
>>  
>> Thank you for your information.
>>  
>> Best,
>> Tim
>> 
>> 2015-02-22 0:53 GMT-06:00 Jonathan Aquilina <jaquil...@eagleeyet.net 
>> <mailto:jaquil...@eagleeyet.net>>:
>> Where I work we run transient (temporary) clusters on Amazon EMR. When I was 
>> reading up on how things work, the documentation suggested using Ganglia for 
>> monitoring memory usage, network usage, and so on. That way, depending on how 
>> things are set up (for example, pulling data directly into the cluster from 
>> an Amazon S3 bucket), you can check that the network link stays saturated and 
>> data keeps flowing in.
>> 
>> What I am suggesting is to take a look at Ganglia.
>> 
>>  
>> ---
>> Regards,
>> Jonathan Aquilina
>> Founder Eagle Eye T
>> On 2015-02-22 07:42, Fang Zhou wrote:
>> 
>> Hi Jonathan,
>>  
>> Thank you.
>>  
>> The number of files affects the memory usage of the Namenode.
>>  
>> I just want to find the real memory usage of the Namenode.
>>  
>> The heap usage keeps changing, so I have no idea which value is the right 
>> one.
>>  
>> Thanks,
>> Tim
>> 
>> On Feb 22, 2015, at 12:22 AM, Jonathan Aquilina <jaquil...@eagleeyet.net 
>> <mailto:jaquil...@eagleeyet.net>> wrote:
>> 
>> I am rather new to Hadoop, but wouldn't the difference potentially be in how 
>> the files are split in terms of size?
>> 
>>  
>> ---
>> Regards,
>> Jonathan Aquilina
>> Founder Eagle Eye T
>> On 2015-02-21 21:54, Fang Zhou wrote:
>> 
>> Hi All,
>> 
>> I want to test the memory usage on Namenode and Datanode.
>> 
>> I tried using jmap, jstat, /proc/<pid>/stat, top, ps aux, and the Hadoop web 
>> interface to check the memory.
>> The values I get from them differ, and I also found that the memory usage 
>> changes periodically.
>> This is the first thing that confuses me.
>> 
>> I thought that the more files are stored, the more memory the Namenode and 
>> Datanodes would use.
>> I also thought the memory used by the Namenode should be larger than the 
>> memory used by each Datanode.
>> However, some results show my assumptions are wrong.
>> For example, I tested the Namenode's memory usage with 6000 files and with 
>> 1000 files: according to jmap, the 6000-file run used less memory than the 
>> 1000-file run.
>> I also found that the memory usage of a Datanode is larger than that of the 
>> Namenode.
>> 
>> I really don't know how to measure the memory usage of the Namenode and 
>> Datanodes.
>> 
>> Can anyone give me some advice?
>> 
>> Thanks,
>> Tim
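
(A note for anyone hitting the same confusion: jmap and jstat report live JVM heap, which grows between garbage collections and drops after each GC, so single samples will disagree with each other and with OS-level tools like top or ps, which report the process's resident memory rather than live heap. A minimal sketch, assuming the classic `jstat -gc` column layout, of summing the used-space columns of one sample line to get a total heap-used figure in KB:

```python
# Sketch: total heap used (KB) from one `jstat -gc` sample line.
# Assumes the usual column names, where the *U columns are "used" KB:
# S0U/S1U = survivor spaces, EU = eden, OU = old generation.

def heap_used_kb(header: str, sample: str) -> float:
    """Sum the survivor, eden, and old-gen 'used' columns of a jstat -gc line."""
    used = dict(zip(header.split(), (float(v) for v in sample.split())))
    return used["S0U"] + used["S1U"] + used["EU"] + used["OU"]

# Illustrative only -- these are made-up numbers, not real NameNode output:
header = "S0C S1C S0U S1U EC EU OC OU MC MU"
sample = "512.0 512.0 0.0 128.0 4096.0 1024.0 8192.0 2048.0 4480.0 4100.0"
print(heap_used_kb(header, sample))  # 0 + 128 + 1024 + 2048 = 3200.0
```

Sampling this right after a full GC (for example, one triggered with `jcmd <pid> GC.run`) gives a more stable estimate of live heap than a single arbitrary snapshot.)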
