Hi Jaxon!
MapReduce is just one application (among many, including Tez, Spark, Slider,
etc.) that runs on YARN. Each YARN application decides what it wants to
log. For MapReduce,
Your replication numbers do seem to be on the high side. How did you arrive
at those numbers? If you swamp the datanode with more replication work
than it can do in an iteration (every 3 seconds), things will go bad.
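If it helps, a quick way to see how replication is currently set and then bring it back down is sketched below; the path is a placeholder, not from the original message:

```shell
# Inspect block/replication health for a tree (path is hypothetical)
hdfs fsck /user/hadoop/data -files -blocks | head

# Drop an over-replicated tree back to the usual default of 3;
# -w waits until the re-replication work actually finishes
hdfs dfs -setrep -w 3 /user/hadoop/data
```

Lowering the factor with `-setrep` queues deletion work rather than copy work, so it will not add to the datanode's per-iteration replication load.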
I often check all the running Java processes using `ps aux | grep java`
rather
Nishant,
Sorry about the late reply. You may want to check out
https://ambari.apache.org/mail-lists.html to see if the Ambari user list
can answer your question better.
William Watson
Lead Software Engineer
J.D. Power O2O
Yes, all the files passed must pre-exist. In this case, you would need to run
something as follows:
curl -i -X POST
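The command above was cut off in the original message, so as a hedged sketch only: an HDFS concat via the WebHDFS REST API (host, port, and paths below are placeholders) is one such POST call where every file passed must already exist:

```shell
# Concatenate existing files into a target file via WebHDFS.
# Host, port, and all paths are placeholders; op=CONCAT requires that
# the target and every path listed in "sources" pre-exist.
curl -i -X POST \
  "http://namenode.example.com:9870/webhdfs/v1/user/hadoop/target?op=CONCAT&sources=/user/hadoop/part-1,/user/hadoop/part-2"
```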
Hi Wellington,
All the source parts are (permission, owner, group, size, replication, block size, name):
-rw-r--r-- hadoop supergroup 2.43 KB 2 32 MB part-01-00-000
-rw-r--r-- hadoop supergroup 21.14 MB 2 32 MB part-02-00-000
-rw-r--r-- hadoop supergroup 22.1 MB 2 32 MB part-04-00-000
-rw-r--r-- hadoop supergroup 22.29 MB 2 32 MB
Hi!
I was trying to implement a Hadoop/Spark audit tool, but I ran into a
problem: I can't get the input file location and file name. I can get the
username, IP address, time, and user command, all of this info, from
hdfs-audit.log. But when I submit a MapReduce job, I can't see the input file
location
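For what it's worth, the `src=` field of an hdfs-audit.log entry does carry the path for read commands such as `open`, which is how the input files of a job show up. A minimal parsing sketch (the sample log line is illustrative, not from the poster's cluster):

```python
import re

# Pull the fields the poster mentions (user, ip, command) plus the
# source path out of one hdfs-audit.log line.
AUDIT_RE = re.compile(
    r"allowed=(?P<allowed>\S+)\s+"
    r"ugi=(?P<ugi>\S+).*?"
    r"ip=/(?P<ip>\S+)\s+"
    r"cmd=(?P<cmd>\S+)\s+"
    r"src=(?P<src>\S+)\s+"
    r"dst=(?P<dst>\S+)"
)

def parse_audit_line(line):
    """Return a dict of audit fields, or None if the line doesn't match."""
    m = AUDIT_RE.search(line)
    return m.groupdict() if m else None

# Illustrative sample line in the standard audit format
sample = ("2016-05-10 10:15:30,123 INFO FSNamesystem.audit: allowed=true "
          "ugi=hadoop (auth:SIMPLE) ip=/10.0.0.5 cmd=open "
          "src=/user/hadoop/input/part-01-00-000 dst=null perm=null")

fields = parse_audit_line(sample)
print(fields["cmd"], fields["src"])
```

For a MapReduce job the `open` entries come from the tasks reading their splits, so grouping `src` paths by `ugi` and time window is one way to recover which input files a job touched.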