Hi Harsh,

I need access to the data programmatically for system automation, and hence I do not want a monitoring tool but access to the raw data.
I am more than happy to use an exposed function or client program rather than an internal API. So I am still a bit confused... What is the simplest way to get at this raw disk usage data programmatically? Is there an HDFS equivalent of du and df, or are you suggesting to just run those on the Linux OS (which is perfectly doable)?

Cheers,
Ivan

On 10/17/11 9:05 AM, "Harsh J" <ha...@cloudera.com> wrote:

>Uma/Ivan,
>
>The DistributedFileSystem class is explicitly _not_ meant for public
>consumption; it is an internal one. Additionally, that method has been
>deprecated.
>
>What you need is FileSystem#getStatus() if you want the summarized
>report via code.
>
>A job that runs "du" or "df" is a good idea if you can
>guarantee perfect homogeneity of path names in your cluster.
>
>But I wonder, why won't using a general monitoring tool (such as
>Nagios) for this purpose cut it? What's the end goal here?
>
>P.S. I'd moved this conversation to hdfs-user@ earlier on, but now I
>see it being cross-posted into mr-user, common-user, and common-dev --
>why?
>
>On Mon, Oct 17, 2011 at 9:25 PM, Uma Maheswara Rao G 72686
><mahesw...@huawei.com> wrote:
>> We can write a simple program and you can call this API.
>>
>> Make sure the Hadoop jars are present in your classpath.
>> Just for more clarification: the DNs send their stats as part of their
>> heartbeats, so the NN maintains all the statistics about the disk space
>> usage for the complete filesystem, etc. This API will give you those
>> stats.
>>
>> Regards,
>> Uma
>>
>> ----- Original Message -----
>> From: ivan.nov...@emc.com
>> Date: Monday, October 17, 2011 9:07 pm
>> Subject: Re: Is there a good way to see how full hdfs is
>> To: common-user@hadoop.apache.org, mapreduce-u...@hadoop.apache.org
>> Cc: common-...@hadoop.apache.org
>>
>>> So is there a client program to call this?
>>>
>>> Can one write their own simple client to call this method from all
>>> disks on the cluster?
>>>
>>> How about a map reduce job to collect from all disks on the cluster?
>>>
>>> On 10/15/11 4:51 AM, "Uma Maheswara Rao G 72686"
>>> <mahesw...@huawei.com> wrote:
>>>
>>> >/** Return the disk usage of the filesystem, including total
>>> > * capacity, used space, and remaining space */
>>> >public DiskStatus getDiskStatus() throws IOException {
>>> >  return dfs.getDiskStatus();
>>> >}
>>> >
>>> >DistributedFileSystem has the above API on the Java side.
>>> >
>>> >Regards,
>>> >Uma
>>> >
>>> >----- Original Message -----
>>> >From: wd <w...@wdicc.com>
>>> >Date: Saturday, October 15, 2011 4:16 pm
>>> >Subject: Re: Is there a good way to see how full hdfs is
>>> >To: mapreduce-u...@hadoop.apache.org
>>> >
>>> >> hadoop dfsadmin -report
>>> >>
>>> >> On Sat, Oct 15, 2011 at 8:16 AM, Steve Lewis
>>> >> <lordjoe2...@gmail.com> wrote:
>>> >> > We have a small cluster with HDFS running on only 8 nodes. I
>>> >> > believe that the partition assigned to HDFS might be getting
>>> >> > full, and I wonder if the web tools or Java API have a way to
>>> >> > look at free space on HDFS.
>>> >> >
>>> >> > --
>>> >> > Steven M. Lewis PhD
>>> >> > 4221 105th Ave NE
>>> >> > Kirkland, WA 98033
>>> >> > 206-384-1340 (cell)
>>> >> > Skype lordjoe_com
>>> >>
>>> >
>>>
>
>--
>Harsh J
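
For anyone finding this thread later: Harsh's suggestion of FileSystem#getStatus() can be wrapped in a small standalone client. A minimal sketch (the class name HdfsUsage is made up here; it assumes the Hadoop client jars and a valid core-site.xml/hdfs-site.xml are on the classpath, so it must be run against a live cluster):

```java
// Sketch: print the cluster-wide HDFS disk summary via the public
// FileSystem#getStatus() API (capacity, used, and remaining bytes),
// rather than the internal, deprecated DistributedFileSystem#getDiskStatus().
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FsStatus;

public class HdfsUsage {
    public static void main(String[] args) throws Exception {
        // Picks up fs.defaultFS (or fs.default.name on older releases)
        // from the configuration files on the classpath.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        try {
            FsStatus status = fs.getStatus();
            System.out.printf("capacity:  %d bytes%n", status.getCapacity());
            System.out.printf("used:      %d bytes%n", status.getUsed());
            System.out.printf("remaining: %d bytes%n", status.getRemaining());
        } finally {
            fs.close();
        }
    }
}
```

The CLI equivalent of this summary is `hadoop dfsadmin -report`, as wd noted earlier in the thread, and `hadoop fs -du <path>` gives per-path usage, roughly the HDFS counterpart of du.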