Thank you, Brian.

I found your paper "Using Hadoop as grid storage," and it was very useful.

One thing I did not understand in it is your file usage pattern - do you
deal with small or large files, and do you delete them frequently? Part of
my question is whether HDFS can be used as a regular file system with
frequent file deletes - does it not become fragmented and unreliable?
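
To make the pattern concrete, here is a rough sketch of what I have in
mind, using the Hadoop FileSystem Java API (the class name, paths, counts,
and payload below are made up purely for illustration):

    // Illustration only: churn of many small files that are written and
    // then deleted shortly afterwards, each far below the HDFS block size.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class SmallFileChurn {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration(); // reads core-site.xml / hdfs-site.xml
            FileSystem fs = FileSystem.get(conf);
            for (int i = 0; i < 10000; i++) {
                Path p = new Path("/tmp/churn/file-" + i);
                FSDataOutputStream out = fs.create(p, true); // tiny file
                out.writeBytes("a few hundred bytes of payload\n");
                out.close();
                fs.delete(p, false); // deleted soon after being written
            }
            fs.close();
        }
    }

The same churn would of course arrive as ordinary open/write/unlink calls
when HDFS is mounted via fuse-dfs, which is why I am asking whether this
pattern is safe.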

Thank you,
Mark

On Thu, Dec 2, 2010 at 7:10 AM, Brian Bockelman <bbock...@cse.unl.edu> wrote:

>
> On Dec 2, 2010, at 5:16 AM, Steve Loughran wrote:
>
> > On 02/12/10 03:01, Mark Kerzner wrote:
> >> Hi, guys,
> >>
> >> I see that there is MountableHDFS<
> http://wiki.apache.org/hadoop/MountableHDFS>,
> >> and I know that it works, but my questions are as follows:
> >>
> >>    - How reliable is it for large storage?
> >
> > Shouldn't be any worse than normal HDFS operations.
> >
> >>    - Is it not hiding the regular design questions - we are dealing with
> >>    NameServers after all, but are trying to use it as a regular file
> system?
> >>    - For example, HDFS is not optimized for many small files that get
> >>    written and deleted, but a mounted system will lure one in this
> direction.
> >
> > Like you say, it's not a conventional POSIX fs; it hates small files,
> > where other things may be better.
>
> I would comment that it's extremely reliable.  There's at least one slow
> memory leak in fuse-dfs that I haven't been able to squash, and I typically
> remount things after a month or two of *heavy* usage.
>
> Across all the nodes in our cluster, we probably do a few billion HDFS
> operations per day over FUSE.
>
> Brian
