Hey Adrian,

If you look into your DataNode's logs, you can find the "clienttrace" entries, which tell you which DFSClient ID asked for which block. DFSClients created by MapReduce tasks send their task attempt ID across as part of that client ID string, and you can reuse this to build a trace.

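A rough sketch of pulling the job out of those entries, in case it helps. It assumes the clienttrace line carries "op:", "cliID:" and "blockid:" fields in that order and that the cliID embeds the attempt ID (e.g. DFSClient_attempt_...); check this against what your DataNode version actually logs. The sample line and the names parse_clienttrace, CLIENTTRACE_RE and ATTEMPT_RE are made up for illustration.

```python
import re

# Assumed field layout of a clienttrace entry: "... op: X, cliID: Y, ... blockid: Z, ..."
CLIENTTRACE_RE = re.compile(
    r"op: (?P<op>\w+), cliID: (?P<cli_id>[^,]+),.*?blockid: (?P<block>[^,\s]+)"
)

# Task attempt IDs look like attempt_<jtIdentifier>_<jobNum>_<m|r>_<taskNum>_<attemptNum>.
ATTEMPT_RE = re.compile(r"attempt_(?P<jt>\d+)_(?P<job>\d+)_[mr]_\d+_\d+")


def parse_clienttrace(line):
    """Return op, block, client ID and (if present) attempt/job ID for one log line."""
    if "clienttrace" not in line:
        return None
    m = CLIENTTRACE_RE.search(line)
    if m is None:
        return None
    attempt_id = job_id = None
    a = ATTEMPT_RE.search(m.group("cli_id"))
    if a:  # only MapReduce task clients carry an attempt ID
        attempt_id = a.group(0)
        job_id = "job_{}_{}".format(a.group("jt"), a.group("job"))
    return {
        "op": m.group("op"),
        "block": m.group("block"),
        "cli_id": m.group("cli_id"),
        "attempt_id": attempt_id,
        "job_id": job_id,
    }


# Hypothetical sample entry, not copied from a real cluster.
SAMPLE = (
    "2012-08-24 03:39:00,123 INFO "
    "org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: "
    "src: /10.0.0.1:50010, dest: /10.0.0.2:41234, bytes: 67108864, "
    "op: HDFS_READ, cliID: DFSClient_attempt_201208240339_0001_m_000000_0, "
    "offset: 0, srvID: DS-1, blockid: blk_123456789_1001, duration: 1000000"
)

if __name__ == "__main__":
    print(parse_clienttrace(SAMPLE))
```

Note this only reconstructs the trace after the fact from the logs; it does not make the DataNode aware of the job while it is serving the request.
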
On Fri, Aug 24, 2012 at 3:39 AM, Adrian Suarez <[email protected]> wrote:
> I'm trying to modify Hadoop in a way that requires that the DataNode be
> aware of the MapReduce job that initiated a block request (e.g. fetching an
> input split for a map task). My understanding of the code is limited, but
> as far as I can tell, this information doesn't reach the DataNode, which
> only sees generic requests made by the DFSClient. So, what I'd like to know
> is how requests originating from the TaskTracker make their way to the
> DataNode. Any help would be appreciated.
>
> Adrian

--
Harsh J
