On Mon, Jan 3, 2011 at 11:40 AM, Brian Bockelman <bbock...@cse.unl.edu> wrote:
> It's not immediately clear to me the size of the benefit versus the costs.
> Two cases where one normally thinks about direct I/O are:
> 1) The usage scenario is a cache anti-pattern. This will be true for some
> Hadoop use cases (MapReduce), not true for some others.
> - http://www.jeffshafer.com/publications/papers/shafer_ispass10.pdf
> 2) The application manages its own cache. Not applicable.
> Atom processors, which you mention below, will just exacerbate (1) due to
> the small cache size.
>

Actually, assuming you thrash the cache anyway, having a smaller cache can
often be a good thing. ;-)

> All-in-all, doing this specialization such that you don't hurt the general
> case is going to be tough.

For the Hadoop case, the advantages of O_DIRECT would seem to be petty
compared to using O_APPEND and/or MMAP (yes, I realize this is not quite
the same as what you are proposing, but it seems close enough for most
cases).

Your best case for a win is when you have reasonably random access to a
file, and then something else that would benefit from more logve

-- Chris
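
For what it's worth, the mmap route mentioned above might look roughly like
the sketch below. It is not Hadoop code: it is Python, chosen only for
brevity, and `read_at_offsets` and `demo` are hypothetical names made up for
illustration. It maps a file read-only and slices it at arbitrary offsets, so
caching is left to the kernel page cache instead of being bypassed the way
O_DIRECT would.

```python
import mmap
import os
import tempfile


def read_at_offsets(path, offsets, length):
    """Return `length` bytes at each offset via a read-only memory map.

    Hypothetical helper for illustration only: pages are faulted in on
    demand and cached by the kernel, with no explicit read() calls.
    """
    with open(path, "rb") as f:
        # Length 0 maps the whole file; the file must not be empty.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Slicing an mmap object copies the requested bytes out.
            return [mm[off:off + length] for off in offsets]


def demo():
    """Build a 4096-byte scratch file and read three scattered chunks."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(bytes(range(256)) * 16)  # byte at offset i is i % 256
        path = tmp.name
    try:
        return read_at_offsets(path, [0, 1000, 4092], 4)
    finally:
        os.unlink(path)


if __name__ == "__main__":
    for chunk in demo():
        print(list(chunk))
```

Whether this beats direct I/O depends entirely on the access pattern; the
sketch only shows the mechanism, not a measurement.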