How is locality really implemented?

Fabio Sat, 10 Jan 2015 22:06:07 -0800

Hi everyone,

I am trying to understand how locality is actually implemented in Hadoop, specifically in the Capacity Scheduler.

I see that an application has to specify the "location" for a request, which can be a node, a rack, or ANY (*). So I want to ask: if an application wants a resource request to be considered at higher levels of locality, does it have to submit more than one request for the same resource? For example, if I want my request to be considered at rack and off-switch level as well, I would issue one request specifying the node, one specifying the rack, and one with just *, all of them with relax locality set to true.

This seems necessary to me, since in the Capacity Scheduler code the rack-local requests (and similarly the off-switch ones) are obtained like this:


application.getResourceRequest(priority, node.getRackName())

whereas if I submitted a single request for a specific node only, even with relaxed locality allowed, the scheduler would not be able to process it at rack level, since it specifically looks for requests made for the current rack (and likewise at the off-switch level).
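
The lookup behavior described above can be illustrated with a small, self-contained toy model. It is only a sketch, assuming that requests are stored per priority and keyed by the exact resource name (node, rack, or *). The classes Request and RequestTable are hypothetical stand-ins, not the Hadoop API, and the names "node1" and "/rack1" are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: these classes are hypothetical and are NOT the Hadoop API.
class LocalityLookupSketch {
    static final String ANY = "*";

    // Hypothetical stand-in for a resource request.
    static final class Request {
        final String resourceName;   // node name, rack name, or ANY
        final int numContainers;
        final boolean relaxLocality;

        Request(String resourceName, int numContainers, boolean relaxLocality) {
            this.resourceName = resourceName;
            this.numContainers = numContainers;
            this.relaxLocality = relaxLocality;
        }
    }

    // Requests of one application, keyed by priority and then by resource name.
    static final class RequestTable {
        private final Map<Integer, Map<String, Request>> byPriority = new HashMap<>();

        void add(int priority, Request r) {
            byPriority.computeIfAbsent(priority, p -> new HashMap<>())
                      .put(r.resourceName, r);
        }

        // Exact-match lookup, mirroring the shape of
        // getResourceRequest(priority, resourceName) in the question.
        Request get(int priority, String resourceName) {
            Map<String, Request> m = byPriority.get(priority);
            return m == null ? null : m.get(resourceName);
        }
    }

    public static void main(String[] args) {
        // Case 1: a single node-level request, relax locality = true.
        RequestTable nodeOnly = new RequestTable();
        nodeOnly.add(1, new Request("node1", 1, true));
        System.out.println(nodeOnly.get(1, "node1") != null);  // true
        System.out.println(nodeOnly.get(1, "/rack1") != null); // false
        System.out.println(nodeOnly.get(1, ANY) != null);      // false

        // Case 2: the same need expressed at node, rack, and ANY level.
        RequestTable allLevels = new RequestTable();
        allLevels.add(1, new Request("node1", 1, true));
        allLevels.add(1, new Request("/rack1", 1, true));
        allLevels.add(1, new Request(ANY, 1, true));
        System.out.println(allLevels.get(1, "/rack1") != null); // true
        System.out.println(allLevels.get(1, ANY) != null);      // true
    }
}
```

If the real scheduler behaves like this model, a node-only request is never found by a rack-level or off-switch lookup, regardless of its relax locality flag.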

Is this correct?

If it is, what is the point of the relax locality parameter? I cannot see a situation in which I would want any request with relax locality set to false, in particular at rack and off-switch level. Why would an application issue a rack-level request with relax locality set to false?


Thanks in advance

Fabio
