Good point, Eli. We should update the docs. I think what the guide is trying to say is that the contents of the file aren't guaranteed to be correct until the file is closed (e.g. compressed files, delimited records, etc.). Even though the data shows up as blocks complete, the file is almost always unusable until it is complete (again, without a flush).
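
To make the compressed-file case concrete, here is a minimal Python sketch. It is not Flume or HDFS code: an in-memory buffer stands in for the file, and the helper name is_readable is made up for illustration. It only shows that the bytes a reader could see before the writer closes a gzip stream do not form a valid gzip file.

```python
import gzip
import io
import zlib


def is_readable(data: bytes) -> bool:
    """Return True if data decompresses as a complete gzip stream."""
    try:
        gzip.decompress(data)
        return True
    except (EOFError, OSError, zlib.error):
        # Truncated or malformed stream: the gzip trailer is missing.
        return False


# Hypothetical payload standing in for the events a sink would write.
records = b"".join(b"record-%d\n" % i for i in range(1000))

buf = io.BytesIO()  # stands in for the file being written
writer = gzip.GzipFile(fileobj=buf, mode="wb")
writer.write(records)

# What a concurrent reader could see before the writer closes the stream.
partial = buf.getvalue()

writer.close()  # writes the remaining compressed data and the gzip trailer
complete = buf.getvalue()

print("readable before close:", is_readable(partial))
print("readable after close:", is_readable(complete))
```

So the bytes can be visible to a client and still be useless until the writer finishes the stream.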

On Nov 21, 2011, at 6:53 PM, Eli Collins <[email protected]> wrote:

> Hey gang,
>
> The Flume user guide states "Using dfs has some restrictions and
> requires some extra setup. Files contents will not become available
> until after the sink has been closed. See the Troubleshooting section
> for details."
>
> This isn't accurate. In hdfs data is available to clients as blocks
> complete, ie well before the file is closed. This is true even if you
> don't hflush the stream.
>
> Thanks,
> Eli
