Kevin wrote:
Thank you for the suggestion. I looked at DFSClient. It appears that
the chooseDataNode method decides which data node to connect to.
Currently it chooses the first non-dead data node returned by the
namenode, which has sorted the nodes by proximity to the client.
However, chooseDataNode is private, so overriding it seems infeasible.
Nor are the callers of chooseDataNode public or protected.

I need this because I do not want to trust the namenode's ordering. For
applications where network congestion is rare, we should let the
client decide which data node to load from.


Dangerous. What happens when network congestion arrives and the apps are already out there? Maybe it should be negotiated: the namenode provides an ordered list, and the client can choose from it based on its own measurements. If the namenode provides only one node, that's the one you get to use.
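
A rough sketch of what that negotiation could look like on the client side. This is not DFSClient code: the class ReplicaChooser, the method choose, and the latency map are all hypothetical names made up for illustration, and nodes are plain strings rather than real datanode descriptors. The client only ever picks from the list the namenode handed out, skips nodes it knows are dead, and falls back to the namenode's order when it has no measurements of its own.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Hadoop: picks a data node from the
// namenode's ordered list using the client's own latency measurements.
final class ReplicaChooser {

    /**
     * @param orderedByNamenode nodes in the order the namenode returned them
     * @param deadNodes         nodes the client has already marked as dead
     * @param measuredLatencyMs client-side measurements; may be empty or partial
     * @return the chosen node, or null if no live node is available
     */
    static String choose(List<String> orderedByNamenode,
                         Set<String> deadNodes,
                         Map<String, Double> measuredLatencyMs) {
        String best = null;
        double bestLatency = Double.POSITIVE_INFINITY;
        for (String node : orderedByNamenode) {
            if (deadNodes.contains(node)) {
                continue; // never pick a node known to be dead
            }
            // Assumption: an unmeasured node counts as "infinitely slow",
            // so any measured node is preferred over it.
            double latency =
                measuredLatencyMs.getOrDefault(node, Double.POSITIVE_INFINITY);
            // Strict '<' keeps the namenode's order when latencies tie,
            // including the case where nothing has been measured at all.
            if (best == null || latency < bestLatency) {
                best = node;
                bestLatency = latency;
            }
        }
        return best;
    }
}
```

If the namenode returns a single node, the loop has nothing to choose between and that node is used, which matches the "that's the one you get to use" rule above.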
