On Mon, Jan 3, 2011 at 11:40 AM, Brian Bockelman <bbock...@cse.unl.edu> wrote:

> It's not immediately clear to me the size of the benefit versus the costs.
>  Two cases where one normally thinks about direct I/O are:
> 1) The usage scenario is a cache anti-pattern.  This will be true for some
> Hadoop use cases (MapReduce), not true for some others.
>  - http://www.jeffshafer.com/publications/papers/shafer_ispass10.pdf
> 2) The application manages its own cache.  Not applicable.
> Atom processors, which you mention below, will just exacerbate (1) due to
> the small cache size.
>

Actually, assuming you thrash the cache anyway, having a smaller cache can
often be a good thing. ;-)


> All-in-all, doing this specialization such that you don't hurt the general
> case is going to be tough.


For the Hadoop case, the advantages of O_DIRECT would seem to be
petty compared to using O_APPEND and/or MMAP (yes, I realize this is
not quite the same as what you are proposing, but it seems close enough for
most cases). Your best case for a win is when you have reasonably random
access to a file, and then something else that would benefit from more logve
-- 
Chris
