the best practice is to have an Edge/Gateway node, so that there is no
local copy of data. It is also good from a security perspective.
I think this video of mine can help you understand this better:
https://www.youtube.com/watch?v=t20niJDO1f4
Regards
Gurmukh
On 20/05/16 12:29 AM, Ruhua Jiang wrote:
Lukas
That is my understanding: the default strategy is to avoid a network
transfer by placing the first replica on the same server that executed the
HDFS client code (i.e. in your case the map or reduce task). If writing to
the 'local' node is not possible, then I believe a random node will be chosen.
Your thought is correct. If space is available locally, then it is
automatically stored locally.
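To make the placement rule above concrete, here is a minimal sketch (plain Python, not Hadoop source code) of the first-replica decision as described: write locally if the client runs on a datanode with enough free space, otherwise fall back to a random datanode. The function name, node names, and free-space figures are all illustrative assumptions, not HDFS APIs.

```python
import random

def place_first_replica(client_node, datanodes, free_space, block_size):
    """Illustrative sketch of default HDFS first-replica placement.

    client_node: node running the HDFS client (the map/reduce task),
                 or None if the client is outside the cluster.
    datanodes:   list of datanode names (hypothetical).
    free_space:  dict of datanode name -> free bytes (assumed values).
    block_size:  size of the block being written.
    """
    # If the client runs on a datanode with room, write locally:
    # this avoids a network transfer for the first replica.
    if client_node in datanodes and free_space.get(client_node, 0) >= block_size:
        return client_node
    # Otherwise fall back to a random datanode with enough space.
    candidates = [d for d in datanodes if free_space[d] >= block_size]
    return random.choice(candidates)
```

For example, a client running on "dn2" with space available would get "dn2"; a client outside the cluster would get a random datanode.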
On Fri, Jul 26, 2013 at 5:14 PM, Lukas Kairies wrote:
> Hey,
>
> I am a bit confused about the block placement in Hadoop. Assume that there
> is no replication and a task (map or reduce) writes a file