Re: Recommended file-system for DataNode

Edward Capriolo Thu, 08 Oct 2009 13:02:29 -0700

On Thu, Oct 8, 2009 at 4:00 PM, Jason Venner <jason.had...@gmail.com> wrote:
> noatime is absolutely essential. I forgot to mention it because it is
> automatic for me now.
>
> I have a fun story about atime. I have some Solaris machines with ZFS file
> systems, and I was doing a find on a 6-level hashed directory tree with
> 250000 leaf nodes.
>
> The find on a cold, idle file system was running slowly, and the machine was
> writing at 5-10MB/sec. Solaris lets you toggle atime at runtime;
> when I turned it off, the writes went to 0 and the find sped up
> drastically.
>
> This is very representative of a datanode with many blocks.
>
>
>
> On Thu, Oct 8, 2009 at 12:43 PM, Tom Wheeler <tomwh...@gmail.com> wrote:
>
>> I've used XFS on Silicon Graphics machines and JFS on AIX systems --
>> both were quite fast and extremely reliable, though this long predates
>> my use of Hadoop.
>>
>> To your question, I recently came across a blog that compares
>> performance of several Linux filesystems:
>>
>>   http://log.amitshah.net/2009/04/re-comparing-file-systems.html
>>
>> I'd consider his results anecdotal unless the tests reflect the actual
>> workload of a datanode, but since he's made the code available, you
>> could probably adapt it yourself to get a better measure.
>>
>> On Thu, Oct 8, 2009 at 2:26 PM, Stas Oskin <stas.os...@gmail.com> wrote:
>> > Hi.
>> >
>> > Thanks for the info.
>> >
>> > What about JFS, any idea how well it compares to XFS?
>> >
>> > From what I read, JFS is considered more stable than XFS, but slower,
>> > so I wonder if this is true.
>> >
>> > Also, Ext4 is around the corner and was recently accepted into the kernel,
>> > so I wonder if anyone knows about this one.
>>
>> --
>> Tom Wheeler
>> http://www.tomwheeler.com/
>>
>
>
>
> --
> Pro Hadoop, a book to guide you from beginner to hadoop mastery,
> http://www.amazon.com/dp/1430219424?tag=jewlerymall
> www.prohadoopbook.com a community for Hadoop Professionals
>


The good news is that you are not stuck with the file system you
pick. Assuming you use the normal replication level of 3, you can pull
out a datanode, format its disk with any FS you want, and then put
it back into the cluster, since the other replicas cover its blocks.
Hadoop should not care, after all. I am not suggesting this, but you
could theoretically run each node with a different file system, look
at the performance, and say "THIS is the one for me."
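
If you did try mixing file systems across nodes, you would want to record
which FS each data directory actually sits on, and whether noatime is in
effect, before comparing numbers. Here is a minimal sketch of that check,
assuming a Linux node where /proc/mounts is available. The function names
and the example path /data/1/dfs/data are hypothetical, made up for
illustration; this is not anything Hadoop ships.

```python
def parse_mounts(mounts_text):
    """Parse /proc/mounts-style text into (mount_point, fs_type, options) tuples."""
    entries = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        # Field order: device, mount point, fs type, comma-separated options
        entries.append((fields[1], fields[2], fields[3].split(",")))
    return entries


def describe_data_dir(data_dir, entries):
    """Return (fs_type, has_noatime) for the deepest mount containing data_dir.

    Returns None if no mount point contains the path.
    """
    best = None
    for mount_point, fs_type, options in entries:
        # Compare with a trailing slash so /data/1 does not match /data/10
        prefix = mount_point.rstrip("/") + "/"
        if (data_dir + "/").startswith(prefix):
            if best is None or len(mount_point) > len(best[0]):
                best = (mount_point, fs_type, options)
    if best is None:
        return None
    return best[1], "noatime" in best[2]


# Usage on a live node (the path is a hypothetical data directory):
#   with open("/proc/mounts") as f:
#       entries = parse_mounts(f.read())
#   print(describe_data_dir("/data/1/dfs/data", entries))
```

Running this on each node gives you a small table of FS type and noatime
status to put next to whatever performance numbers you collect.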
