Recommended file-system for DataNode

2009-10-08 Thread Stas Oskin
Hi. I'm using the stock Ext3 as the most tested one, but I wonder: has anyone ever tried, or is anyone even using in production these days, another file system such as JFS, XFS, or maybe even Ext4? I'm exploring ways to boost the performance of DataNodes, and this seems like one possible avenue. Thanks for a

Re: Recommended file-system for DataNode

2009-10-08 Thread Jason Venner
I have used xfs pretty extensively; it seemed to be somewhat faster than ext3. The only trouble we had related to some machines running the PAE 32-bit kernels, where the filesystems would lock up. That is an obscure use case, however. Running JBOD with your dfs.data.dir listing a directory on each devi
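
A minimal sketch of the JBOD layout mentioned above, assuming each physical disk is mounted separately and that dfs.data.dir takes a comma-separated list of directories. The mount points (/mnt/disk1 and so on), the dfs/data subdirectory, and both helper functions are hypothetical names chosen for illustration, not values from this thread.

```python
from xml.sax.saxutils import escape


def dfs_data_dir_value(mount_points, subdir="dfs/data"):
    """Join one data directory per device into a comma-separated value."""
    return ",".join(f"{mp.rstrip('/')}/{subdir}" for mp in mount_points)


def property_xml(name, value):
    """Render a Hadoop-style <property> element as plain text."""
    return (
        "<property>\n"
        f"  <name>{escape(name)}</name>\n"
        f"  <value>{escape(value)}</value>\n"
        "</property>"
    )


# Hypothetical mount points, one per physical disk (no RAID underneath).
mounts = ["/mnt/disk1", "/mnt/disk2", "/mnt/disk3"]
value = dfs_data_dir_value(mounts)
print(property_xml("dfs.data.dir", value))
```

The printed element would go into the DataNode's site configuration file, so that each listed directory sits on its own disk.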

Re: Recommended file-system for DataNode

2009-10-08 Thread Stas Oskin
Hi. Thanks for the info. The question is whether XFS performance justifies switching from the more common Ext3. JBOD is a great approach indeed. Regards. 2009/10/8 Jason Venner > I have used xfs pretty extensively, it seemed to be somewhat faster than > ext3. > > The only trouble we had related t

Re: Recommended file-system for DataNode

2009-10-08 Thread Tom Wheeler
As an aside, there's a short article comparing the two in the latest edition of Linux Journal. It was hardly scientific, but the main points were: - XFS is faster than ext3, especially for large files - XFS is currently unsupported on Red Hat Enterprise, but apparently will be soon. On Thu,

Re: Recommended file-system for DataNode

2009-10-08 Thread Jason Venner
Busy datanodes become bound by the metadata lookup times for the directory and inode entries required to open a block. Anything that optimizes that will help substantially. We are thinking of playing with btrfs, and using a small SSD for our file system metadata, and the spinning disks for the blo

Re: Recommended file-system for DataNode

2009-10-08 Thread Stas Oskin
Hi. Thanks for the info. What about JFS, any idea how well it compares to XFS? From what I have read, JFS is considered more stable than XFS but slower, so I wonder if this is true. Also, Ext4 is around the corner and was recently accepted into the kernel, so I wonder if anyone knows about this

Re: Recommended file-system for DataNode

2009-10-08 Thread paul
Check out the bottom of this page: http://wiki.apache.org/hadoop/DiskSetup noatime is all we've done in our environment. I haven't found it worth the time to optimize further since we're CPU bound in most of our jobs. -paul On Thu, Oct 8, 2009 at 3:26 PM, Stas Oskin wrote: > Hi. > > Thanks
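
A small sketch for checking which data disks are mounted without noatime, assuming the usual /proc/mounts field order (device, mount point, file-system type, options). The function name, the /mnt/ prefix, and the sample text are hypothetical and only illustrate the check.

```python
def mounts_missing_noatime(mounts_text, prefixes):
    """Return mount points under the given prefixes that lack 'noatime'."""
    missing = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, options = fields[1], fields[3].split(",")
        if mount_point.startswith(tuple(prefixes)) and "noatime" not in options:
            missing.append(mount_point)
    return missing


# Hypothetical sample in /proc/mounts format.
SAMPLE = """\
/dev/sda1 / ext3 rw,relatime 0 0
/dev/sdb1 /mnt/disk1 ext3 rw,noatime 0 0
/dev/sdc1 /mnt/disk2 ext3 rw 0 0
"""

print(mounts_missing_noatime(SAMPLE, ["/mnt/"]))
```

On a real Linux machine the sample text could be replaced with the contents of /proc/mounts.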

Re: Recommended file-system for DataNode

2009-10-08 Thread Tom Wheeler
I've used XFS on Silicon Graphics machines and JFS on AIX systems -- both were quite fast and extremely reliable, though this long predates my use of Hadoop. To your question, I recently came across a blog that compares performance of several Linux filesystems: http://log.amitshah.net/2009/04/

Re: Recommended file-system for DataNode

2009-10-08 Thread Jason Venner
noatime is absolutely essential. I forgot to mention it because it is automatic for me now. I have a fun story about atime: I have some Solaris machines with ZFS file systems, and I was doing a find on a 6-level hashed directory tree with 25 leaf nodes. The find on a cold idle file system wa

Re: Recommended file-system for DataNode

2009-10-08 Thread Edward Capriolo
On Thu, Oct 8, 2009 at 4:00 PM, Jason Venner wrote: > noatime is absolutely essential, I forgot to mention it, because it is > automatic now for me. > > I have a fun story about atime, I have some Solaris machines with ZFS file > systems, and I was doing a find on a 6 level hashed directory tree w

Re: Recommended file-system for DataNode

2009-10-08 Thread Stas Oskin
Hi Jason. Btrfs is cool; I read that its performance is 10% better than that of any other FS that comes close to it. Can you post the results of any of your findings here? Regards. 2009/10/8 Jason Venner > Busy datanodes become bound by the metadata lookup times for the directory > and inode entries requir

Re: Recommended file-system for DataNode

2009-10-08 Thread Stas Oskin
Hi. I heard about this option before, but never actually tried it. There is also another option, called "relatime", which is described as being more compatible than noatime. Can anyone comment on this? Regards. 2009/10/8 Edward Capriolo > On Thu, Oct 8, 2009 at 4:00 PM, Jason Venner > wrote: > >
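
For readers comparing the two options, here is a simplified model of when each one lets a read update a file's access time. It assumes the Linux relatime rule: update atime only if it is not newer than mtime or ctime, or, on newer kernels, if it is at least a day old. The function names are hypothetical and timestamps are plain seconds; this is a sketch of the policy, not kernel code.

```python
DAY_SECONDS = 24 * 60 * 60


def relatime_needs_update(atime, mtime, ctime, now):
    """Model of relatime: update only when atime is stale."""
    if mtime >= atime or ctime >= atime:
        return True  # file changed since it was last read
    return now - atime >= DAY_SECONDS  # assumed 24-hour refresh rule


def noatime_needs_update(atime, mtime, ctime, now):
    """Model of noatime: reads never update atime."""
    return False


# A file read repeatedly after one write: only the first read
# triggers an atime update under relatime.
print(relatime_needs_update(atime=100, mtime=200, ctime=200, now=300))
print(relatime_needs_update(atime=300, mtime=200, ctime=200, now=400))
```

Under this model relatime keeps atime roughly meaningful for tools that compare it with mtime, while noatime avoids the write entirely.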

Re: Recommended file-system for DataNode

2009-10-08 Thread Edward Capriolo
On Thu, Oct 8, 2009 at 9:15 PM, Stas Oskin wrote: > Hi. > > I heard about this option before, but never actually tried it. > > There is also another option, called "relatime", which is described as being > more compatible than noatime. > Can anyone comment on this? > > Regards. > > 2009/10/8 Edward Ca

Re: Recommended file-system for DataNode

2009-10-09 Thread stephen mulcahy
paul wrote: Check out the bottom of this page: http://wiki.apache.org/hadoop/DiskSetup Just re-reading that page, two suggestions that may not be appropriate, 1. Reducing reserved space to 0. AFAIK, ext3 needs a certain amount of free space to function properly - the man page for mke2fs sugg

Re: Recommended file-system for DataNode

2009-10-09 Thread Edward Capriolo
On a 1 TB disk, reducing reserved space from 5% to 2% saves almost 30 GB. Cutting the inodes down saves you some space but not nearly as much, say 10 GB. The difference is that once you format your disk you can't change the inode count. tune2fs can tune reserved blocks while the disk is mounted. I did res
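
The reserved-space figure above can be checked with a short calculation. This sketch assumes a decimal terabyte (10**12 bytes) and whole-number percentages; the function names are hypothetical. A formatted file system is somewhat smaller than the raw disk, which would explain "almost" 30 GB.

```python
def reserved_bytes(disk_bytes, reserved_percent):
    """Bytes held back for root at the given reserved-blocks percentage."""
    return disk_bytes * reserved_percent // 100


def space_reclaimed(disk_bytes, old_percent, new_percent):
    """Bytes freed by lowering the reserved percentage."""
    return reserved_bytes(disk_bytes, old_percent) - reserved_bytes(disk_bytes, new_percent)


ONE_TB = 10**12  # assumption: decimal terabyte, ignoring file-system overhead
saved = space_reclaimed(ONE_TB, old_percent=5, new_percent=2)
print(f"{saved / 10**9:.1f} GB reclaimed")
```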

Re: Recommended file-system for DataNode

2009-10-09 Thread Stas Oskin
Hi. AFAIK, this space is reserved for root logs, in case the filesystem is full, so the kernel won't crash. From what I have seen, it only has to be enabled on the root partition; on the data partitions it can safely be set to 0. I usually leave the default 5% on root, boot and swap (as the space saving

Re: Recommended file-system for DataNode

2009-10-11 Thread Stas Oskin
Hi. By the way, about noatime: is it safe to just set this for all partitions used, including / and boot? Thanks. 2009/10/9 Stas Oskin > Hi. > AFAIK, this space is reserved for root logs, in case the filesystem is > full, so the kernel won't crash. > > From what I have seen, it has to be only

Re: Recommended file-system for DataNode

2009-10-12 Thread Jason Venner
Unless you are serving mail via imap or pop, it is generally considered safe. On Sun, Oct 11, 2009 at 1:11 AM, Stas Oskin wrote: > Hi. > > By the way, about the noatime - is it safe just to set this for all > partitions used, including / and boot? > > Thanks. > > 2009/10/9 Stas Oskin > > > Hi.

Re: Recommended file-system for DataNode

2009-10-12 Thread Stas Oskin
Hi. Thanks for the advice. Regards. 2009/10/12 Jason Venner > Unless you are serving mail via imap or pop, it is generally considered > safe. > > On Sun, Oct 11, 2009 at 1:11 AM, Stas Oskin wrote: > > > Hi. > > > > By the way, about the noatime - is it safe just to set this for all > > partit