Hi Harsh,

I need access to the data programmatically for system automation, and
hence I do not want a monitoring tool but access to the raw data.

I am more than happy to use an exposed function or client program and not
an internal API.

So I am still a bit confused... What is the simplest way to get at this
raw disk usage data programmatically? Is there an HDFS equivalent of du
and df, or are you suggesting just running those on the Linux OS (which
is perfectly doable)?

Cheers,
Ivan


On 10/17/11 9:05 AM, "Harsh J" <ha...@cloudera.com> wrote:

>Uma/Ivan,
>
>The DistributedFileSystem class explicitly is _not_ meant for public
>consumption, it is an internal one. Additionally, that method has been
>deprecated.
>
>What you need is FileSystem#getStatus() if you want the summarized
>report via code.
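>
>A minimal sketch of that (assuming the Hadoop jars are on the classpath
>and core-site.xml points fs.default.name at your NameNode; the class
>name below is just for illustration):
>
>import org.apache.hadoop.conf.Configuration;
>import org.apache.hadoop.fs.FileSystem;
>import org.apache.hadoop.fs.FsStatus;
>
>public class HdfsUsage {
>  public static void main(String[] args) throws Exception {
>    // Picks up the cluster config (core-site.xml) from the classpath
>    FileSystem fs = FileSystem.get(new Configuration());
>    // Summarized capacity/used/remaining for the whole filesystem
>    FsStatus status = fs.getStatus();
>    System.out.println("capacity  = " + status.getCapacity());
>    System.out.println("used      = " + status.getUsed());
>    System.out.println("remaining = " + status.getRemaining());
>  }
>}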
>
>A job that runs "du" or "df" is a good idea if you can guarantee
>perfect homogeneity of path names across your cluster.
>
>But I wonder, why won't using a general monitoring tool (such as
>nagios) for this purpose cut it? What's the end goal here?
>
>P.s. I'd moved this conversation to hdfs-user@ earlier on, but now I
>see it being cross posted into mr-user, common-user, and common-dev --
>Why?
>
>On Mon, Oct 17, 2011 at 9:25 PM, Uma Maheswara Rao G 72686
><mahesw...@huawei.com> wrote:
>> You can write a simple program that calls this API.
>>
>> Make sure the Hadoop jars are present on your classpath.
>> Just for more clarification: the DNs send their stats as part of
>> heartbeats, so the NN maintains all the statistics about disk-space
>> usage for the complete filesystem. This API will give you those
>> stats.
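>>
>> A rough sketch of such a program against the 0.20-era API
>> (getDiskStatus() is deprecated in newer releases and
>> DistributedFileSystem is an internal class, so treat this as
>> illustrative only; the class name is hypothetical):
>>
>> import org.apache.hadoop.conf.Configuration;
>> import org.apache.hadoop.fs.FileSystem;
>> import org.apache.hadoop.hdfs.DistributedFileSystem;
>>
>> public class DfsDiskStatus {
>>   public static void main(String[] args) throws Exception {
>>     FileSystem fs = FileSystem.get(new Configuration());
>>     // Only valid when the default filesystem is HDFS
>>     DistributedFileSystem dfs = (DistributedFileSystem) fs;
>>     // Cluster-wide totals as maintained by the NN from DN heartbeats
>>     DistributedFileSystem.DiskStatus ds = dfs.getDiskStatus();
>>     System.out.println("capacity=" + ds.getCapacity());
>>     System.out.println("used=" + ds.getDfsUsed());
>>     System.out.println("remaining=" + ds.getRemaining());
>>   }
>> }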
>>
>> Regards,
>> Uma
>>
>> ----- Original Message -----
>> From: ivan.nov...@emc.com
>> Date: Monday, October 17, 2011 9:07 pm
>> Subject: Re: Is there a good way to see how full hdfs is
>> To: common-user@hadoop.apache.org, mapreduce-u...@hadoop.apache.org
>> Cc: common-...@hadoop.apache.org
>>
>>> So is there a client program to call this?
>>>
>>> Can one write their own simple client to call this method from all
>>> disks on the cluster?
>>>
>>> How about a map reduce job to collect from all disks on the cluster?
>>>
>>> On 10/15/11 4:51 AM, "Uma Maheswara Rao G 72686"
>>> <mahesw...@huawei.com> wrote:
>>>
>>> >  /** Return the disk usage of the filesystem, including total
>>> >   * capacity, used space, and remaining space */
>>> >  public DiskStatus getDiskStatus() throws IOException {
>>> >    return dfs.getDiskStatus();
>>> >  }
>>> >
>>> >DistributedFileSystem exposes the above API on the Java side.
>>> >
>>> >Regards,
>>> >Uma
>>> >
>>> >----- Original Message -----
>>> >From: wd <w...@wdicc.com>
>>> >Date: Saturday, October 15, 2011 4:16 pm
>>> >Subject: Re: Is there a good way to see how full hdfs is
>>> >To: mapreduce-u...@hadoop.apache.org
>>> >
>>> >> hadoop dfsadmin -report
>>> >>
>>> >> On Sat, Oct 15, 2011 at 8:16 AM, Steve Lewis
>>> >> <lordjoe2...@gmail.com> wrote:
>>> >> > We have a small cluster with HDFS running on only 8 nodes - I
>>> >> > believe that the partition assigned to HDFS might be getting
>>> >> > full, and wonder if the web tools or Java API have a way to
>>> >> > look at free space on HDFS.
>>> >> >
>>> >> > --
>>> >> > Steven M. Lewis PhD
>>> >> > 4221 105th Ave NE
>>> >> > Kirkland, WA 98033
>>> >> > 206-384-1340 (cell)
>>> >> > Skype lordjoe_com
>>> >> >
>>> >> >
>>> >> >
>>> >>
>>> >
>>>
>>>
>>
>
>
>
>-- 
>Harsh J
>
