I agree with you, and I wonder whether anything can be done to help
administrators spot possible problems in this area.
I have two ideas:
1. Add a namenodeIp parameter to hdfs dfsadmin -printTopology to obtain rack
information from the specified namenode.
2. Add debug information to the printTopology method of class DFSAdmin.
However, the command only queries one fixed namenode, so the debug logs of the
other namenode cannot be printed.
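
In the meantime, the manual check across namenodes could be scripted roughly
as below. This is only a sketch under assumptions, not anything that exists in
Hadoop: the helper names (parse_topology, diff_topology, fetch_topology) are
made up for illustration, it assumes the generic -fs option can be used to
point dfsadmin at one specific namenode, and it assumes the -printTopology
output is a "Rack: <path>" line followed by indented "ip:port (hostname)"
lines, which may differ between versions.

```python
import subprocess
from collections import defaultdict


def parse_topology(text):
    """Map each rack path to the set of datanode "ip:port" entries under it.

    Assumes the output looks like:
        Rack: /rack1
           10.0.0.1:9866 (dn1)
    """
    racks = defaultdict(set)
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("Rack:"):
            current = stripped[len("Rack:"):].strip()
            racks[current]  # register the rack even if no datanodes follow
        elif current is not None:
            # Keep only the first token (ip:port); ignore hostname and state.
            racks[current].add(stripped.split()[0])
    return dict(racks)


def diff_topology(a, b):
    """Return {datanode: (rack_in_a, rack_in_b)} where the two views disagree."""
    def by_node(topology):
        return {node: rack for rack, nodes in topology.items() for node in nodes}

    nodes_a, nodes_b = by_node(a), by_node(b)
    return {
        node: (nodes_a.get(node), nodes_b.get(node))
        for node in sorted(set(nodes_a) | set(nodes_b))
        if nodes_a.get(node) != nodes_b.get(node)
    }


def fetch_topology(namenode_uri):
    """Run -printTopology against one namenode (assumes -fs targets it)."""
    result = subprocess.run(
        ["hdfs", "dfsadmin", "-fs", namenode_uri, "-printTopology"],
        capture_output=True, text=True, check=True,
    )
    return parse_topology(result.stdout)


# Hypothetical usage; the namenode addresses are placeholders:
#   nn1 = fetch_topology("hdfs://nn1.example.com:8020")
#   nn2 = fetch_topology("hdfs://nn2.example.com:8020")
#   for node, (rack1, rack2) in diff_topology(nn1, nn2).items():
#       print(f"{node}: nn1 says {rack1}, nn2 says {rack2}")
```

A datanode that one namenode does not know about shows up with None as its
rack on that side, so missing registrations are reported as well as rack
mismatches.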
At 2022-11-10 18:44:19, "Ayush Saxena" <ayush...@gmail.com> wrote:
If you are debugging a suspected topology misconfiguration, you need to check
all the namenodes anyway, since one namenode may be misconfigured while
another is not. The issue may not surface while the properly configured
namenode is the Active namenode, but one failover can break things.
Secondly, checking only the Active namenode's topology to triage a suspected
rack misconfiguration isn't a complete solution: what if the present standby
namenode was the active one when the issue occurred? In such cases you have to
check all the namenodes anyway.
Getting the topology from individual namenodes is a doable task for any admin
and isn't difficult as such. If it weren't so simple to do, we could perhaps
have explored getting the topology from all the namenodes as part of the
DebugAdmin commands.
-Ayush
On Thu, 10 Nov 2022 at 15:45, 尉雁磊 <tre2...@163.com> wrote:
So what you are saying is that this is a management issue, not a code issue.
Even if the administrator has misconfigured rack awareness on a namenode, the
administrator cannot locate the actual problem from the logs and can only
check whether the deployment steps were correct.
At 2022-11-10 17:34:37, "Ayush Saxena" <ayush...@gmail.com> wrote:
In a stable cluster, all the datanodes usually report to all the namenodes, so
the information is more or less the same on every namenode. This isn't data
that goes stale and lands you in a mess. Moreover, these aren't user commands
but admin commands: it is assumed that the admin has an idea of the system and
how it behaves. There are ways to get this detail from a specific namenode if
required; even each namenode's UI gives details about the datanode states and
so on.
From the code point of view, I don't think this is a good change to make, or
one that is going to get accepted.
-Ayush
On Thu, 10 Nov 2022 at 13:53, 尉雁磊 <yuyan...@qianxin.com> wrote:
hdfs dfsadmin -printTopology always gets its information from one fixed
namenode in the cluster, whether that namenode is active or standby. I don't
think this is normal; this command should always get information from the
active namenode!