Your observation is correct: the backup node will also download the fsimage.
If you look at the journey/evolution of Hadoop, we had the primary, the
backup-only node, the checkpoint node and then a generic secondary node.
The checkpoint node will do the merge of the fsimage and the edits log.
On 25/9/17 5:57 pm, Chang.Wu wrote:
Hi
Can you explain the job a bit? There are a few RPC timeouts, such as at
the datanode level, mapper timeouts, etc.
On 28/9/17 1:47 pm, Demon King wrote:
Hi,
We have finished a YARN application and deployed it to a Hadoop 2.6.0
cluster. But if one machine in the cluster is down, our application will
h
Well, in an actual job the input will be a file.
So, instead of:
echo "bla ble bli bla" | python mapper.py | sort -k1,1 | python reducer.py
you will have:
cat file.txt | python mapper.py | sort -k1,1 | python reducer.py
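To make the pipeline concrete, here is a minimal word-count sketch of what mapper.py and reducer.py might contain (hypothetical; the thread does not show their actual contents). In a real Streaming job these would be two separate scripts reading sys.stdin and printing tab-separated key/value pairs, with the shell's sort -k1,1 playing the role of Hadoop's shuffle/sort phase:

```python
def mapper(lines):
    # mapper.py stage: emit one "word\t1" pair per word on each input line.
    for line in lines:
        for word in line.split():
            yield "%s\t1" % word


def reducer(sorted_pairs):
    # reducer.py stage: sum counts per word. Assumes the pairs arrive
    # sorted by key, which is what the intermediate sort -k1,1 guarantees.
    current, total = None, 0
    for pair in sorted_pairs:
        word, count = pair.split("\t")
        if word != current:
            if current is not None:
                yield "%s\t%d" % (current, total)
            current, total = word, 0
        total += int(count)
    if current is not None:
        yield "%s\t%d" % (current, total)


if __name__ == "__main__":
    # Local simulation of: echo "bla ble bli bla" | mapper | sort | reducer
    pairs = sorted(mapper(["bla ble bli bla"]))
    for out in reducer(pairs):
        print(out)
```

Running it prints one "word<TAB>count" line per distinct word, which is exactly the output shape the shell pipeline above produces.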
The file has to be on HDFS (keeping it simple; it can be on other
filesystems), t