Question about HDFS capacity and remaining

2009-01-29 Thread Bryan Duxbury
Hey all, I'm currently installing a new cluster, and noticed something a little confusing. My DFS is *completely* empty - 0 files in DFS. However, in the namenode web interface, the reported "capacity" is 3.49 TB, but the "remaining" is 3.25TB. Where'd that .24TB go? There are literally z

Re: Question about HDFS capacity and remaining

2009-01-29 Thread Hairong Kuang
It's taken by non-dfs files. Hairong On 1/29/09 3:23 PM, "Bryan Duxbury" wrote: > Hey all, > > I'm currently installing a new cluster, and noticed something a > little confusing. My DFS is *completely* empty - 0 files in DFS. > However, in the namenode web interface, the reported "capacity" i

Re: Question about HDFS capacity and remaining

2009-01-29 Thread Bryan Duxbury
There are no non-dfs files on the partitions in question. df -h indicates that there is 907GB capacity, but only 853GB remaining, with 200M used. The only thing I can think of is the filesystem overhead. -Bryan On Jan 29, 2009, at 4:06 PM, Hairong Kuang wrote: It's taken by non-dfs files

Re: Question about HDFS capacity and remaining

2009-01-29 Thread Doug Cutting
Ext2 by default reserves 5% of the drive for use by root only. That'd be about 45GB of your 907GB capacity, which would account for most of the discrepancy. You can adjust this with tune2fs. Doug Bryan Duxbury wrote: There are no non-dfs files on the partitions in question. df -h indicates that t
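
[Editor's note] The arithmetic behind the reply above can be checked with a short sketch. It uses only the figures quoted in the thread (907GB capacity, 853GB remaining, a 5% reservation); the function and variable names are illustrative assumptions, not part of Hadoop or any ext2 tool, and the 200M reported as used is ignored as negligible.

```python
# Rough check of the figures quoted in the thread.
# All names here are hypothetical, chosen only for this illustration.

def reserved_gb(capacity_gb: float, reserved_fraction: float = 0.05) -> float:
    """Space held back for root, given the reserved fraction (5% per the thread)."""
    return capacity_gb * reserved_fraction


capacity_gb = 907   # capacity reported by df -h, per the thread
remaining_gb = 853  # remaining space reported by df -h, per the thread

gap_gb = capacity_gb - remaining_gb        # space that appears to be missing
reserve_gb = reserved_gb(capacity_gb)      # share explained by the root reservation
unexplained_gb = gap_gb - reserve_gb       # left over for other filesystem overhead

print(f"missing:     {gap_gb} GB")
print(f"reserved:    {reserve_gb:.2f} GB")
print(f"unexplained: {unexplained_gb:.2f} GB")
```

Under these assumptions the reservation covers roughly 45GB of the 54GB gap, which matches the claim that it accounts for most, but not all, of the discrepancy.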

Re: Question about HDFS capacity and remaining

2009-01-29 Thread Raghu Angadi
Doug Cutting wrote: Ext2 by default reserves 5% of the drive for use by root only. That'd be about 45GB of your 907GB capacity, which would account for most of the discrepancy. You can adjust this with tune2fs. Plus, I think DataNode reports only 98% of the space by default. Raghu. Doug Bryan D

Re: Question about HDFS capacity and remaining

2009-01-30 Thread Bryan Duxbury
Hm, very interesting. Didn't know about that. What's the purpose of the reservation? Just to give root preference or leave wiggle room? If it's not strictly necessary it seems like it would make sense to reduce it to essentially 0%. -Bryan On Jan 29, 2009, at 6:18 PM, Doug Cutting wrote:

Re: Question about HDFS capacity and remaining

2009-01-30 Thread stephen mulcahy
Bryan Duxbury wrote: Hm, very interesting. Didn't know about that. What's the purpose of the reservation? Just to give root preference or leave wiggle room? If it's not strictly necessary it seems like it would make sense to reduce it to essentially 0%. AFAIK it is needed for defragmentation

Re: Question about HDFS capacity and remaining

2009-01-30 Thread Doug Cutting
Bryan Duxbury wrote: Hm, very interesting. Didn't know about that. What's the purpose of the reservation? Just to give root preference or leave wiggle room? I think it's so that, when the disk is full, root processes don't fail, only user processes. So you don't lose, e.g., syslog. With mode

Re: Question about HDFS capacity and remaining

2009-01-30 Thread Brian Bockelman
For what it's worth, our organization did extensive tests on many filesystems benchmarking their performance when they are 90 - 95% full. Only XFS retained most of its performance when it was "mostly full" (ext4 was not tested)... so, if you are thinking of pushing things to the limits, tha

Re: Question about HDFS capacity and remaining

2009-01-30 Thread Edward Capriolo
Very interesting note for a new cluster checklist. Good to tune the file system down from 5%. On a related note some operating systems ::cough:: FreeBSD will report negative disk space when you go over the quota. What does that mean? We run nagios with NRPE to run remote disk checks. We configure

Re: Question about HDFS capacity and remaining

2009-01-30 Thread Bryan Duxbury
Did you publish those results anywhere? On Jan 30, 2009, at 9:56 AM, Brian Bockelman wrote: For what it's worth, our organization did extensive tests on many filesystems benchmarking their performance when they are 90 - 95% full. Only XFS retained most of its performance when it was "mostl

Re: Question about HDFS capacity and remaining

2009-02-01 Thread Sagar Naik
Hi Brian, Is it possible to publish these test results along with configuration options? -Sagar Brian Bockelman wrote: For what it's worth, our organization did extensive tests on many filesystems benchmarking their performance when they are 90 - 95% full. Only XFS retained most of its per