First, Hadoop itself doesn’t have any caching.
Second, if it is a map-only job, then the data doesn’t go through the
network.
So look somewhere else.
From: Avery, John [mailto:jav...@akamai.com]
Sent: Wednesday, December 27, 2017 3:20 PM
To: user@hadoop.apache.org
Subject: Help me
Never mind. I found my stupid mistake. I didn’t reset a variable… this fact had
escaped me for the past two days.
From: "Avery, John"
Date: Wednesday, December 27, 2017 at 4:20 PM
To: "user@hadoop.apache.org"
Subject: Help me understand hadoop caching
I’m writing a program using the C API for Hadoop. I have a 4-node cluster
(set up according to
https://www.tutorialspoint.com/hadoop/hadoop_multi_node_cluster.htm). Of the 4
nodes, one is both the namenode and a datanode; the others are datanodes (with
one being a secondary namenode).