Hi Jaxon! MapReduce is just an application (one of many, including Tez, Spark, Slider, etc.) that runs on YARN. Each YARN application decides what it wants to log. For MapReduce, https://github.com/apache/hadoop/blob/27a1a5fde94d4d7ea0ed172635c146d594413781/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapTask.java#L762 logs which split is being processed. Are you not seeing this message? Perhaps check the log level of the MapTask.
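If the task logs are quieter than INFO, one way to surface that message is to raise the map-task log level. This is a minimal sketch assuming the standard `mapreduce.map.log.level` property (it can also be passed per job as `-Dmapreduce.map.log.level=INFO`):

```xml
<!-- mapred-site.xml: make sure map tasks log at INFO or lower,
     so the "Processing split: ..." line from MapTask is emitted -->
<property>
  <name>mapreduce.map.log.level</name>
  <value>INFO</value>
</property>
```

After the job finishes, the message should show up in the aggregated task logs (e.g. via `yarn logs -applicationId <appId>`).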
For the other YARN applications, the logging may be different. In any case, for all of these frameworks, if the file is on HDFS, the HDFS audit log should have a record.

HTH,
Ravi

On Wed, Jul 26, 2017 at 11:27 PM, Jaxon Hu <hujiaxu...@gmail.com> wrote:
> Hi!
>
> I was trying to implement a Hadoop/Spark audit tool, but I met a
> problem: I can't get the input file location and file name. I can get
> the username, IP address, time, and user command from hdfs-audit.log.
> But when I submit a MapReduce job, I can't see the input file location
> in either the Hadoop logs or the Hadoop ResourceManager. Does Hadoop
> have an API or log that contains this info through some configuration?
> If so, what should I configure?
>
> Thanks.
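Since the HDFS audit log is the common denominator across frameworks, here is a rough sketch of pulling the accessed file path out of it for an audit tool. It assumes the default audit-log field layout (`allowed= ugi= ip= cmd= src= dst= ...`); the sample line and the `parse_audit_line` helper are illustrative, so verify the regex against the actual output of your cluster:

```python
import re

# Matches the default hdfs-audit.log field layout; adjust if your
# cluster's audit log pattern has been customized.
AUDIT_RE = re.compile(
    r'allowed=(?P<allowed>\S+)\s+'
    r'ugi=(?P<ugi>\S+(?: \(auth:\w+\))?)\s+'
    r'ip=/(?P<ip>\S+)\s+'
    r'cmd=(?P<cmd>\S+)\s+'
    r'src=(?P<src>\S+)\s+'
    r'dst=(?P<dst>\S+)'
)

def parse_audit_line(line):
    """Return a dict of audit fields, or None if the line doesn't match."""
    m = AUDIT_RE.search(line)
    return m.groupdict() if m else None

# Illustrative audit-log line (not taken from a real cluster)
sample = ('2017-07-26 23:27:01,123 INFO FSNamesystem.audit: allowed=true '
          'ugi=jaxon (auth:SIMPLE) ip=/10.0.0.1 cmd=open '
          'src=/data/input/part-00000 dst=null perm=null proto=rpc')

rec = parse_audit_line(sample)
print(rec['cmd'], rec['src'])  # the command and which file it touched
```

A `cmd=open` record gives you exactly the input file a map task read, along with the user and IP, which covers the fields the audit tool needs.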