Hi, I have the same question regarding the documentation: is there something like this for memory and CPU utilization as well?
Sent from my iPhone

Thanks,
JJ

On Oct 19, 2011, at 5:00 PM, Rajiv Chittajallu <raj...@yahoo-inc.com> wrote:

> ivan.nov...@emc.com wrote on 10/18/11 at 09:23:50 -0700:
>> Cool, is there any documentation on how to use the JMX stuff to get
>> monitoring data?
>
> I don't know if there is any specific documentation. These are the
> mbeans you might be interested in:
>
> Namenode:
>
> Hadoop:service=NameNode,name=FSNamesystemState
> Hadoop:service=NameNode,name=NameNodeInfo
> Hadoop:service=NameNode,name=jvm
>
> JobTracker:
>
> Hadoop:service=JobTracker,name=JobTrackerInfo
> Hadoop:service=JobTracker,name=QueueMetrics,q=<queuename>
> Hadoop:service=JobTracker,name=jvm
>
> DataNode:
>
> Hadoop:name=DataNodeInfo,service=DataNode
>
> TaskTracker:
>
> Hadoop:service=TaskTracker,name=TaskTrackerInfo
>
> You may also want to monitor shuffle_exceptions_caught in
> Hadoop:service=TaskTracker,name=ShuffleServerMetrics
>
>> Cheers,
>> Ivan
>>
>> On 10/17/11 6:04 PM, "Rajiv Chittajallu" <raj...@yahoo-inc.com> wrote:
>>
>>> If you are running > 0.20.204:
>>> http://phanpy-nn1.hadoop.apache.org:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeInfo
>>>
>>> ivan.nov...@emc.com wrote on 10/17/11 at 09:18:20 -0700:
>>>> Hi Harsh,
>>>>
>>>> I need access to the data programmatically for system automation, and
>>>> hence I do not want a monitoring tool but access to the raw data.
>>>>
>>>> I am more than happy to use an exposed function or client program and
>>>> not an internal API.
>>>>
>>>> So I am still a bit confused... What is the simplest way to get at
>>>> this raw disk usage data programmatically? Is there an HDFS
>>>> equivalent of du and df, or are you suggesting to just run those on
>>>> the Linux OS (which is perfectly doable)?
>>>> Cheers,
>>>> Ivan
>>>>
>>>> On 10/17/11 9:05 AM, "Harsh J" <ha...@cloudera.com> wrote:
>>>>
>>>>> Uma/Ivan,
>>>>>
>>>>> The DistributedFileSystem class is explicitly _not_ meant for public
>>>>> consumption; it is an internal one. Additionally, that method has
>>>>> been deprecated.
>>>>>
>>>>> What you need is FileSystem#getStatus() if you want the summarized
>>>>> report via code.
>>>>>
>>>>> A job that possibly runs "du" or "df" is a good idea if you can
>>>>> guarantee perfect homogeneity of path names in your cluster.
>>>>>
>>>>> But I wonder, why won't using a general monitoring tool (such as
>>>>> Nagios) for this purpose cut it? What's the end goal here?
>>>>>
>>>>> P.s. I'd moved this conversation to hdfs-user@ earlier on, but now I
>>>>> see it being cross-posted into mr-user, common-user, and common-dev
>>>>> -- why?
>>>>>
>>>>> On Mon, Oct 17, 2011 at 9:25 PM, Uma Maheswara Rao G 72686
>>>>> <mahesw...@huawei.com> wrote:
>>>>>> We can write a simple program and you can call this API.
>>>>>>
>>>>>> Make sure the Hadoop jars are present in your classpath.
>>>>>> Just for more clarification: DNs send their stats as part of their
>>>>>> heartbeats, so the NN maintains all the statistics about disk space
>>>>>> usage for the complete filesystem. This API will give you those
>>>>>> stats.
>>>>>>
>>>>>> Regards,
>>>>>> Uma
>>>>>>
>>>>>> ----- Original Message -----
>>>>>> From: ivan.nov...@emc.com
>>>>>> Date: Monday, October 17, 2011 9:07 pm
>>>>>> Subject: Re: Is there a good way to see how full hdfs is
>>>>>> To: common-user@hadoop.apache.org, mapreduce-u...@hadoop.apache.org
>>>>>> Cc: common-...@hadoop.apache.org
>>>>>>
>>>>>>> So is there a client program to call this?
>>>>>>>
>>>>>>> Can one write their own simple client to call this method from all
>>>>>>> disks on the cluster?
>>>>>>>
>>>>>>> How about a MapReduce job to collect from all disks on the
>>>>>>> cluster?
>>>>>>> On 10/15/11 4:51 AM, "Uma Maheswara Rao G 72686"
>>>>>>> <mahesw...@huawei.com> wrote:
>>>>>>>
>>>>>>>> /** Return the disk usage of the filesystem, including total
>>>>>>>>  * capacity, used space, and remaining space */
>>>>>>>> public DiskStatus getDiskStatus() throws IOException {
>>>>>>>>   return dfs.getDiskStatus();
>>>>>>>> }
>>>>>>>>
>>>>>>>> DistributedFileSystem has the above API on the Java API side.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Uma
>>>>>>>>
>>>>>>>> ----- Original Message -----
>>>>>>>> From: wd <w...@wdicc.com>
>>>>>>>> Date: Saturday, October 15, 2011 4:16 pm
>>>>>>>> Subject: Re: Is there a good way to see how full hdfs is
>>>>>>>> To: mapreduce-u...@hadoop.apache.org
>>>>>>>>
>>>>>>>>> hadoop dfsadmin -report
>>>>>>>>>
>>>>>>>>> On Sat, Oct 15, 2011 at 8:16 AM, Steve Lewis
>>>>>>>>> <lordjoe2...@gmail.com> wrote:
>>>>>>>>>> We have a small cluster with HDFS running on only 8 nodes. I
>>>>>>>>>> believe the partition assigned to HDFS might be getting full,
>>>>>>>>>> and I wonder if the web tools or Java API have a way to look at
>>>>>>>>>> free space on HDFS.
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Steven M. Lewis PhD
>>>>>>>>>> 4221 105th Ave NE
>>>>>>>>>> Kirkland, WA 98033
>>>>>>>>>> 206-384-1340 (cell)
>>>>>>>>>> Skype lordjoe_com
>>>>>
>>>>> --
>>>>> Harsh J
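[List archive note] The /jmx servlet mentioned in the thread returns JSON, which makes it easy to script against for automation. Below is a minimal Python sketch of pulling capacity figures out of a NameNodeInfo bean response. The sample payload and its field names (`Total`, `Used`, `Free`) are illustrative assumptions rather than captured cluster output; in a real deployment you would fetch the JSON from `http://<namenode>:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeInfo` and should verify the field names against your Hadoop version.

```python
import json

# Illustrative (assumed) shape of a /jmx?qry=...NameNodeInfo response.
# In practice you would retrieve this with urllib from the NameNode's
# HTTP port instead of hardcoding it.
SAMPLE_RESPONSE = """
{"beans": [{
    "name": "Hadoop:service=NameNode,name=NameNodeInfo",
    "Total": 100000000000,
    "Used": 42000000000,
    "Free": 58000000000
}]}
"""

def capacity_summary(jmx_json):
    """Extract total/used bytes and percent used from a /jmx response."""
    bean = json.loads(jmx_json)["beans"][0]
    total, used = bean["Total"], bean["Used"]
    return {"total": total, "used": used, "pct_used": 100.0 * used / total}

summary = capacity_summary(SAMPLE_RESPONSE)
print(summary)
```

The same pattern applies to the jvm beans listed above for memory metrics; only the `qry` parameter and field names change.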
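[List archive note] For the `hadoop dfsadmin -report` route suggested above, the summary lines can be parsed for automation. This sketch parses a sample of the report's header section; the sample text approximates the 0.20-era output format from memory, so treat the exact labels as assumptions and check them against what your version actually prints.

```python
import re

# Approximated (not verified) sample of the summary section printed by
# `hadoop dfsadmin -report`; in practice you would capture this via
# subprocess instead of hardcoding it.
SAMPLE_REPORT = """\
Configured Capacity: 100000000000 (93.13 GB)
Present Capacity: 90000000000 (83.82 GB)
DFS Remaining: 60000000000 (55.88 GB)
DFS Used: 30000000000 (27.94 GB)
DFS Used%: 33.33%
"""

def parse_report(text):
    """Turn 'Key: <number> (...)' lines into a dict of numeric values."""
    stats = {}
    for match in re.finditer(r"^([A-Za-z ]+%?):\s+([\d.]+)", text, re.M):
        key, value = match.group(1), match.group(2)
        stats[key] = float(value) if "." in value else int(value)
    return stats

stats = parse_report(SAMPLE_REPORT)
print(stats["Configured Capacity"], stats["DFS Used"])
```

Compared with the JMX route, this couples your automation to human-readable output that can change between releases, which is why scraping the servlet's JSON is usually the more robust choice.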