Hi,

> That is quite doable.  Typically, the way that you do this is to buffer the
> data either in memory or on local disk.  Both work fine.  You can munch on
> the data until the cows come home that way.  Hadoop will still schedule your
> tasks and handle failures for you.

Yes, that is an option, unless the data no longer fits on the local
disk. In that case I could write it to a temp file on HDFS and hope
that it stays "close by". Do you know whether HDFS makes any
guarantees about keeping data local to the node where it was created?

Thanks,

Markus
