Hello.

I read in a few articles, such as [1], that data block statistics can be
obtained from the "historical data access recorder from the NameNode log
file". Another paper states that frequently accessed data blocks can be
determined using the logs provided by the NameNode.

I searched for related information on hadoop.apache.org but didn't find
anything. I read about job counters, the fsimage, edit logs, audit logs...
but found nothing about a metric that represents "frequently accessed data
blocks" on DataNodes.

I'd appreciate any help on whether any Hadoop component collects this kind
of statistic.

Thank you


[1] Jia-xuan Wu, Chang-sheng Zhang, Bin Zhang, Peng Wang, "A new
data-grouping-aware dynamic data placement method that take into account
jobs execute frequency for Hadoop", Microprocessors and Microsystems,
Volume 47, Part A, 2016, Pages 161-169
