Hi everyone,
I was trying to understand the process that makes the resources of a
container available again to the ResourceManager.
As far as I can guess from the logs, the AM:
- sends a stop request to the NodeManager for the specific container
- immediately tells the RM about the release of the
As far as I know, the code running in each reducer is the same code you
specify in your reduce function, so if you know in advance which features of
the data you want to ignore, you can simply instruct the reducers to skip them.
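
To make that concrete, here is a minimal sketch in plain Python (not the Hadoop API). The names `should_ignore` and `reduce_counts` are hypothetical, and the rule "drop empty values" is only an assumption for illustration; the real condition depends on your data.

```python
from itertools import groupby


def should_ignore(key, value):
    """Hypothetical rule: ignore records whose value is empty."""
    return value == ""


def reduce_counts(pairs):
    """Count the kept values per key.

    `pairs` must already be sorted by key, which is how a reducer
    receives its input after the shuffle phase.
    """
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        kept = [v for _, v in group if not should_ignore(key, v)]
        if kept:  # emit nothing for keys whose values were all ignored
            yield key, len(kept)
```

For example, `list(reduce_counts([("a", "x"), ("a", ""), ("b", "")]))` keeps only the non-empty record for key `a` and emits nothing for key `b`.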
If you are able to tell whether or not to keep an entry at the beginning,
you can filter
Hi Rishabh,
I didn't know anything about Hadoop a few months ago, and I started from
the very beginning. I wouldn't suggest starting with the online documentation,
which is always fragmented, incomplete and sometimes not even up to date.
Also starting by directly using Hadoop is the fastest way to
Hi everyone,
I have a question about the ResourceManager behavior:
when the ResourceManager allocates a container, it takes some time before
the NMToken is sent and then received by the ApplicationMaster.
During this time, it is possible to receive another heartbeat from the AM,
equal to the last
I noticed that too. I think Hadoop keeps the file open all the time, and
when you delete it, it can simply no longer write to it and doesn't try
to recreate it. Not sure if it's a Log4j problem or a Hadoop one...
yanghaogn, what is the *correct* way to delete the Hadoop logs? I didn't
find