Yes and no.  Most OSs/filesystems will write file data to disk within about
5 seconds if the files are small.  But if data is written, read, and deleted
quickly, it may never hit disk.  Applications can also request that data be
flushed to disk earlier.
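
As a rough sketch of what "requesting an earlier flush" looks like from an
application (Python used only for illustration; the function name
write_durably and the file name are made up here, and this is not what
Hadoop itself does for map output):

```python
import os
import tempfile


def write_durably(path, data):
    """Write bytes to path, then ask the OS to push them to the device."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # move Python's userspace buffer into the OS page cache
        os.fsync(f.fileno())  # ask the kernel to write the cached pages to disk
    return os.path.getsize(path)


if __name__ == "__main__":
    # Hypothetical file name, written into a throwaway directory.
    with tempfile.TemporaryDirectory() as d:
        size = write_durably(os.path.join(d, "spill.bin"), b"intermediate map output")
        print("bytes written:", size)
```

Without the fsync call the data would normally sit in the page cache until
the kernel decides to write it back on its own schedule.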

In a Hadoop environment, small or medium-sized files will most likely get
to disk, but will be read back from the page cache in RAM rather than from
disk.

You can tune the OS to cache more in RAM, for longer, before flushing to
disk if you wish.  For Linux, look up /proc/sys/vm (dirty_ratio,
dirty_background_ratio, and related).
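
To see what a box is currently set to, something like the following should
work.  It is only a sketch: the helper name read_vm_settings is made up, and
the two *_centisecs entries are my assumption about which "related" knobs
are worth checking.  Anything missing (for example on a non-Linux machine)
comes back as None.

```python
import os

# Writeback knobs under /proc/sys/vm; the last two are assumed to be
# among the "related" settings and may not exist on every kernel.
VM_KNOBS = (
    "dirty_ratio",
    "dirty_background_ratio",
    "dirty_expire_centisecs",
    "dirty_writeback_centisecs",
)


def read_vm_settings(base="/proc/sys/vm", names=VM_KNOBS):
    """Return {knob: int value, or None if it cannot be read or parsed}."""
    settings = {}
    for name in names:
        try:
            with open(os.path.join(base, name)) as f:
                settings[name] = int(f.read().strip())
        except (OSError, ValueError):
            settings[name] = None
    return settings


if __name__ == "__main__":
    for name, value in read_vm_settings().items():
        print(f"{name} = {value}")
```

Changing a value generally means writing a new number to the same file as
root, or using sysctl.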

On 5/19/09 7:19 AM, "paula_ta" <paula...@yahoo.com> wrote:

> 
> 
> Is it possible that some intermediate data produced by mappers and written to
> the local file system resides in memory in the file system cache and is
> never flushed to disk ?  Eventually reducers will retrieve this data via
> HTTP - possibly without the data ever being written to disk ?
> 
> thanks
> Paula
> 
> --
> View this message in context:
> http://www.nabble.com/Is-intermediate-data-produced-by-mappers-always-flushed-
> to-disk---tp23617347p23617347.html
> Sent from the Hadoop core-user mailing list archive at Nabble.com.
> 
> 
